Multiple rewriterules

Multiple rewriting rules don't apply, only first one does.

         

larry_r

9:11 pm on Sep 14, 2010 (gmt 0)

10+ Year Member



Hi all,

I'm new to this forum, so here's my first question.
I have three rules in my .htaccess and only one of them seems to work:

RewriteRule ^([a-zA-Z0-9]+)/([a-zA-Z0-9]+)$ index.php?p=$1&id=$2
RewriteRule ^([a-zA-Z0-9]+)/([a-zA-Z0-9]+)$ index.php?p=$1&cat=$2
RewriteRule ^([a-zA-Z0-9]+)/([a-zA-Z0-9]+)$ index.php?p=$1&sort=$2 [L]


So in essence if I have, say, www.site.com/somepage/123 this will translate correctly to www.site.com/?p=somepage&id=123 but the latter two rules won't work - why? How can I remedy this situation?

Here's my whole .htaccess file if it helps any. Everything up until the "cat" and "sort" translations works beautifully.


RewriteEngine On
Options -Indexes
Options +FollowSymLinks
RewriteRule ^admin/$ /admin/index.php
RewriteRule ^(.*)/$ index.php?p=$1
RewriteRule ^([a-zA-Z0-9]+)/([a-zA-Z0-9]+)$ index.php?p=$1&id=$2
RewriteRule ^([a-zA-Z0-9]+)/([a-zA-Z0-9]+)$ index.php?p=$1&cat=$2
RewriteRule ^([a-zA-Z0-9]+)/([a-zA-Z0-9]+)$ index.php?p=$1&sort=$2 [L]

larry_r

9:31 pm on Sep 14, 2010 (gmt 0)

10+ Year Member



Oh and I tried using the [L] flag on all RewriteRules. Didn't work.

If I call the URLs the ugly way, i.e. www.site.com/?p=somepage&cat=goggles, things work like they should, but www.site.com/somepage/goggles just won't cut it.

g1smd

10:31 pm on Sep 14, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The pattern is looking at the URL request.

If a requested URL-path such as "thing/123" matches all three patterns, the first rule will match, its rewrite will be performed, and the other two patterns will never get a chance to match that request.

You need to think about your URLs, and there needs to be something about each of the URL types that is different, such that you have three different patterns and for any particular URL request only one of those patterns can ever match.

One site I worked on used
example.com/s34
for section numbers,
example.com/p3456
for pages,
example.com/r3456
for reviews,
example.com/n567
for news, and so on.

The RegEx patterns specifically tested the first character to work out what type of content was to be returned.

You are correct in that you will need the [L] flag on every rule.
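
As a rough sketch of that idea (the script names section.php, page.php, review.php and news.php, and the "id" parameter, are placeholders I've made up for illustration, not the code from that site), the leading letter decides which single rule can match:

```apache
RewriteEngine On
# Each pattern starts with a different literal letter, so at most one
# rule can match any given request. Script names are hypothetical.
RewriteRule ^s([0-9]+)$ section.php?id=$1 [L]
RewriteRule ^p([0-9]+)$ page.php?id=$1 [L]
RewriteRule ^r([0-9]+)$ review.php?id=$1 [L]
RewriteRule ^n([0-9]+)$ news.php?id=$1 [L]
```

The same principle applies to two-level URLs: something in the requested path itself (digits versus letters, a fixed keyword, a prefix) has to tell the rules apart.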

larry_r

6:11 am on Sep 15, 2010 (gmt 0)

10+ Year Member



Thank you!
It hadn't occurred to me that the regexps had to be distinct, so I changed them all a bit, because IDs are always numeric, cats always alphabetic and so on.

Now then..

the next problem I'm facing is to do with the trailing slashes.


RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/$ /$1/$2/ [R=301]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ index.php?p=$1&cat=$2 [L]


This, for some reason, doesn't work even though the regexps are not identical, ie. I can access the page with "www.somepage.com/something/else" but not with "www.somepage.com/something/else/"

Any ideas?

g1smd

7:17 am on Sep 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The trailing slash indicates a folder, so for pages omit the trailing slash.

The rule does not work because you are redirecting to a path that will match the rule again and will be redirected again in an infinite loop.

For that rule add the protocol and hostname to the target and omit the trailing slash from the target. Add the [L] flag to that rule.
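
Something along these lines, as a sketch only, with www.example.com standing in for your real hostname:

```apache
# Externally redirect /letters/letters/ to the same URL without the slash.
# The redirect target no longer matches this pattern, so there is no loop.
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/$ http://www.example.com/$1/$2 [R=301,L]
# Internally rewrite the slashless URL to the script.
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ index.php?p=$1&cat=$2 [L]
```

The redirect goes first so that the client is sent to the slashless URL before any internal rewrite takes place.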

larry_r

7:28 am on Sep 15, 2010 (gmt 0)

10+ Year Member



Could you please elaborate or provide an example? I'm afraid I don't quite understand :(

I tried the following:
RewriteRule ^http://www.myserver.com/([a-zA-Z]+)/([a-zA-Z]+)$ /$1/$2/ [L]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ index.php?p=$1&cat=$2 [L]

but that didn't work. Obviously I didn't quite get what you meant?

jdMorgan

4:47 pm on Sep 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Your first try at your new first rule allows subsequent rules to expose your script filepaths as URLs because of the missing [L] flag. If the [L] flag were present, the rule would redirect any requested URL ending in a slash to itself, creating an infinite redirection loop:

RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)/$ /$1/$2/ [R=301]

If the purpose of this first rule is to add a trailing slash, then your pattern should not be requiring that slash to already be present. The pattern "on the left side of the rule" must match the URL-path requested by the client (e.g. a browser or search robot). That URL-path will then be rewritten to the filepath or redirected to the URL (depending on which syntax you specify) "on the right side of the rule."

The protocol and hostname are never visible to RewriteRule, which sees only the URL-path, and therefore your new version won't ever match anything:

RewriteRule ^http://www.myserver.com/([a-zA-Z]+)/([a-zA-Z]+)$ /$1/$2/ [L]
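
If a rule genuinely needs to depend on the hostname, the usual approach is to test it in a RewriteCond against %{HTTP_HOST} and keep the RewriteRule pattern to the URL-path alone. A minimal sketch, assuming www.myserver.com is the hostname you meant and that this code sits in the .htaccess file in your document root:

```apache
# The hostname is tested in the condition, not in the rule pattern.
# In .htaccess the rule pattern sees only the path, without a leading slash.
RewriteCond %{HTTP_HOST} ^www\.myserver\.com$ [NC]
RewriteRule ^([a-zA-Z]+)/([a-zA-Z]+)$ index.php?p=$1&cat=$2 [L]
```

If the site answers on only one hostname, the condition is unnecessary and the rule alone will do.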

Please don't try to guess at this stuff, because you risk the 'health' and search rankings of your site by doing so. This is server configuration code, and must be exactly, precisely, correct; it is not to be trifled with. If you have not done so already, spend some time with the documentation cited in our Apache Forum Charter.

Also, comment your code in some meaningful way. It not only tells us what you intend your rules to do, but it will remind you of your own intent when you look at this code again after, say, two years...

Hopefully, this will get you closer, although the 'fix' for the 'sort URLs' is just my best-guess at your intent:


Options +FollowSymLinks -Indexes
RewriteEngine on
#
# Externally redirect to add missing trailing slash to all requested extensionless
# URL-paths of the form /<letters> or /<letters>/<letters-and-or-numbers>
RewriteRule ^([a-z]+(/[a-z0-9]+)?)$ http://www.example.com/$1/ [NC,R=301,L]
#
# Externally redirect requests for non-canonical hostname to canonical hostname
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
# Internally rewrite requested extensionless page/sort URL-paths of the
# form /<letters>/<specific-sort-type>/ to the script filepath
RewriteRule ^([a-z]+)/(price|size|color|ascending|descending)/$ index.php?p=$1&sort=$2 [NC,L]
#
# Else internally rewrite requested extensionless page/category URL-paths
# of the form /<letters>/<letters>/ to the script filepath
RewriteRule ^([a-z]+)/([a-z]+)/$ index.php?p=$1&cat=$2 [NC,L]
#
# Internally rewrite requested extensionless page/id URL-paths
# of the form /<letters>/<numbers>/ to the script filepath
RewriteRule ^([a-z]+)/([0-9]+)/$ index.php?p=$1&id=$2 [NC,L]
#
# Internally rewrite requests for admin page URL-path to script filepath
RewriteRule ^admin/$ /admin/index.php [L]
#
# Else rewrite all remaining requested single-directory-level extensionless
# URL-paths of the form /<letters-and-or-numbers>/ to the script filepath
RewriteRule ^([a-z0-9]+)/$ index.php?p=$1 [NC,L]

Note the use of the [NC] flag (No Case) to make the pattern match case-insensitive, so that each character class can be written as [a-z] instead of [a-zA-Z].

Frankly, I'd "go the other way" and remove those unnecessary trailing slashes unless the requested URL-path resolves to a physically-existing directory. Otherwise, you're just making each of your URLs one character longer than necessary and making them harder to read and to pronounce (for example, on the phone or on radio).
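
A sketch of that alternative, assuming www.example.com as the canonical hostname; it would replace the slash-adding redirect above, and the internal rewrite patterns would then need their trailing slashes removed to match:

```apache
# Leave physically-existing directories alone; they keep their slash.
# Redirect any other URL-path ending in a slash to the slashless form.
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.example.com/$1 [R=301,L]
```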

Once you get all of the above rules working, you may wish to add code to externally 301-redirect any direct client requests for your script-filepaths (as your old URLs) to the new extensionless URLs. This may speed up search engines re-indexing your site using the new URL format, but should not be done until all the code above is finished, tested, and working and all links on your own site to "index.php?<query-string>" URLs are updated to the new format.
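
For the "id" URLs only, that later redirect might look roughly like the sketch below. It assumes the query string always arrives in the order p=...&id=..., and that the new URLs keep the trailing slash used in the rules above; the "cat" and "sort" forms would need similar rules of their own:

```apache
# Match only direct client requests for /index.php?p=<letters>&id=<numbers>.
# THE_REQUEST holds the original request line, so requests that reached
# index.php through an internal rewrite are not redirected again.
# The trailing "?" on the target clears the old query string.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.php\?p=([a-z]+)&id=([0-9]+)\ HTTP/ [NC]
RewriteRule ^index\.php$ http://www.example.com/%1/%2/? [R=301,L]
```

This belongs ahead of the internal rewrite rules, with the other external redirects.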

Jim

larry_r

8:27 pm on Sep 15, 2010 (gmt 0)

10+ Year Member



Jim,

thank you for your generous advice! Regarding documenting the rules - I have done that but removed it from my posting for the sake of clarity. I guess I should have left them so as to be more clear for you others.. :) I am quite anal about documentation otherwise.

My intent is not to rewrite all rules so that they explicitly require a trailing slash, rather, to make it easier for users to use whichever syntax, ie. so that both www.myserver.com/something/else and www.myserver.com/something/else/ would point to the same direction, and neither would throw errors at anyone. If someone feels a need to use a trailing slash then, heck, let them see it with a trailing slash. If not, same thing.

After g1smd's comments, I got my rules working like they should. My next issue at hand was the trailing slash problem.

So, to elaborate myself so as to hopefully help you help me:

* All rules have been tried and tested, and work as intended.
* Only problem is the possible presence/absence of the trailing slash.

I don't want to add or remove it, just make sure that users see the content whichever way they try to access a page.

The way I had seen this done was by using the 301 redirect, but I couldn't get my example working.

I hope these pointers would give you a clue of what I'm going at.

Thanks in advance!

g1smd

10:01 pm on Sep 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



"I don't want to add or remove it, just make sure that users see the content whichever way they try to access a page."

The correct operation should be:
- if user requests URL with trailing slash, server sends a 301 redirect to the correct URL without trailing slash.
- if user requests URL without trailing slash, server sends the correct content with a 200 OK response code.

The links on the site should point to URLs without a trailing slash.