Need rewriterule for flat url

Rule for subfolders to rewrite to index.php

         

Kim Ludvigsen

4:46 am on Oct 11, 2010 (gmt 0)

10+ Year Member



I want flat URLs and have a rewrite to transform three folders to three arguments:

RewriteRule ^/*([^/]+)/([^/]+)/([^/]+)$ /index.php?year=$1&week=$2&sign=$3 [NC]

This works fine for addresses like domain.com/arg1/arg2/arg3 which rewrites to index.php?year=arg1&week=arg2&sign=arg3

In addition I want any other addresses to rewrite to index.php with no arguments, like:
domain.com/arg1 = index.php
domain.com/arg1/arg2 = index.php
domain.com/arg1/../arg7 = index.php

How can I do that?

sublime1

3:38 am on Oct 12, 2010 (gmt 0)

10+ Year Member



So, to rephrase the question: the RewriteRule as written should match only when there are exactly three /-separated arguments (which it does correctly now), and any other URL should go to index.php?

If so, I believe all you need to do is (1) add the L flag to the first rewrite rule, and (2) add another rewrite rule that matches any other URL, like:


RewriteRule ^/*([^/]+)/([^/]+)/([^/]+)$ /index.php?year=$1&week=$2&sign=$3 [NC,L]
RewriteRule .* /index.php [L]


But I wonder about things like images, CSS, and JavaScript that are just files to be served; the catch-all rule above would send those requests to index.php as well. Also, I think the NC flag is unneeded. In that case, the solution might be:


# transform three argument URL into query string parameters to be passed to PHP
RewriteRule ^/*([^/]+)/([^/]+)/([^/]+)$ /index.php?year=$1&week=$2&sign=$3 [L]
# If the URL does not match a filesystem file, then let index.php handle anything else
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule .* /index.php [L]


All that said, is a rewrite the best place to handle this? I would guess that you're already writing some code in your PHP to generate the flat URLs in the first place. If so, why not use PHP, rather than Apache rewrite rules, to handle the URL parsing logic? (The $_SERVER and $_REQUEST superglobals give PHP the same request context that Apache has.) From a design perspective, you could centralize the code that transforms your URLs to and from flat URLs. Of course, my assumption may be wrong :-)

Tom

Kim Ludvigsen

4:02 am on Oct 12, 2010 (gmt 0)

10+ Year Member



You are right, RewriteRule .* /index.php [L] will also rewrite requests for images and other files. So no good in this case.

I have read that it would mean less stress on the server to use .htaccess to do this instead of using PHP - is that wrong?

By the way, your second suggestion works like a charm, thanks!

sublime1

2:47 pm on Oct 12, 2010 (gmt 0)

10+ Year Member



Kim --

When it comes to answering questions about "less stress on the server", there's only one way to find out: test it. I have heard that .htaccess files are very inefficient, especially when they test for the existence of a file, when they are big, when they contain lots of regular expressions, and so on. I say "prove it". I have heard that PHP is slow. I say "prove it".

<soapbox>
The problem with optimizing for performance before testing is that it creates complexity. Unnecessary complexity just as often leads to problems as it does to solutions to the problems it is intended to solve.

When I was getting going with software engineering, I read a book called Programming Pearls, which I think was the source of a quote: "Code first, optimize later". We have so many conceptions about what is "bad" or "slow" or "wrong" based on ... who knows? I would bet that a very large majority of the things people do to optimize (before doing any testing, but based on some suggestion) result in no discernible performance improvement.

And, just as frequently, I have seen the same people do silly things that are several orders of magnitude slower (10x, 100x, 1Mx -- no, really).

I once worked with a less experienced programmer who would obsess over a line of C++ code for hours until it was as "efficient" as possible, not to mention entirely unreadable. In one instance, when he released his code, it took more than an hour to complete. I solved the problem without touching his code: it used a SQL statement that needed an index on a 100M-row table. Once I added the index, the entire thing ran in 100 milliseconds.

I then showed him that his code was also wrong (one of his optimizations had introduced a bug), so I rewrote it in an "inefficient" but readable and correct way. I challenged him to demonstrate that his optimal (now corrected) version was faster. It wasn't. It was exactly the same, because the C++ compiler had a built-in optimizer that produced bit-for-bit identical machine code.
</soapbox>

I am almost certain that any performance difference between doing the work in Apache and doing it in PHP will be indistinguishable. So do what is most readable, most maintainable, most consistent, etc.

Tom

jdMorgan

3:50 pm on Oct 12, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



A few tweaks for completeness and performance:

# Internally rewrite extensionless three-argument URL to index.php plus query string
RewriteRule ^([^/]+)/([^/]+)/([^/.]+)$ /index.php?year=$1&week=$2&sign=$3 [L]
#
# If the requested URL-path does not resolve to a physically-existing file or to a frequently-requested
# image, CSS, or JS file, then internally rewrite to index.php, removing the query string
RewriteCond $1 !(\.(gif|jpe?g|png|ico|css|js)|index\.php)$
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ /index.php? [L]

Any requests processed by this .htaccess code that include one or more leading slashes on the URL-path should be externally redirected to remove those slashes before doing the internal rewrites. However, if you decide to handle these "SEO-friendly" URLs entirely within PHP this becomes a non-issue, since only the second rule will be needed (to pass all requests for non-physically-extant resources to index.php).
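
A minimal sketch of such an external redirect, to be placed above the internal rewrites. It assumes that the canonical hostname is www.example.com (a placeholder, not taken from this thread), that RewriteEngine is already on, and that the code lives in the site's root .htaccess file. The test is made against THE_REQUEST (the client's original request line) because the URL-path seen by RewriteRule in .htaccess may already have had its leading slashes removed:

```apache
# Hypothetical example: www.example.com is a placeholder hostname.
# If the client's request line starts the URL-path with two or more slashes,
# capture the rest of the path (up to any query string) ...
RewriteCond %{THE_REQUEST} ^[A-Z]+\s//+([^\s?]*)
# ... and externally redirect to the single-slash URL. The original query
# string is appended automatically; NE avoids re-escaping the captured path,
# which is still in its encoded form.
RewriteRule ^ http://www.example.com/%1 [R=301,NE,L]
```

As with any redirect, this is worth testing on the actual server before relying on it, since the handling of repeated slashes can vary with server version and configuration.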

The "exclusion list" in the second rule is a performance tweak. It eliminates doing a filesystem check for the most-frequently-requested filetypes. It is neither necessary nor desirable to try to make this exclusion list comprehensive; only the most-frequently-requested "page and infrastructure" filetypes need be excluded to see a significant performance gain -- reported here as between 5% and 14% by one member.

Jim

Kim Ludvigsen

3:59 pm on Oct 12, 2010 (gmt 0)

10+ Year Member



Thanks to both of you. I only have 27 files on the site (24 png, 2 php, 1 css), so I guess the file checking should cause no problem. And with the addition of jdMorgan's exclusion list it is even less stressful. All of my text is in a database.