Simple mod rewrite not quite working

         

quack

7:54 pm on Jan 11, 2010 (gmt 0)

10+ Year Member



I'm trying to reduce the size of my htaccess file but am missing something that I know is simple.

Below is a sample of what I have now. If I try to use the regex that I think should work, it returns zone=index.php.

Currently the htaccess file is about 300 lines, and I'm pretty sure I can get rid of most of that, but I simply can't get it straight.

Any help is appreciated.


RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
RewriteRule ^Africa index.php?zone=Africa [NC,L]
RewriteRule ^North-America index.php?zone=North-America [NC,L]
RewriteRule ^Asia index.php?zone=Asia [NC,L]
RewriteRule ^Caribbean index.php?zone=Caribbean [NC,L]
# etc.

[edited by: jdMorgan at 10:22 pm (utc) on Jan. 11, 2010]
[edit reason] example.com [/edit]

jdMorgan

10:24 pm on Jan 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Please post the modified code that does not work if you want to ask about problems with that code.

Jim

quack

10:59 pm on Jan 11, 2010 (gmt 0)

10+ Year Member



I didn't post what didn't work because I thought it was best to show what I wanted to accomplish. I think that what I have is way off. I'm not sure if the problem is the use of hyphens, a conflict between rewrite rules, or that I'm clueless about regex, but here it is...


RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
RewriteRule ^(.*) index.php?zone=$1 [NC,L]

jdMorgan

11:19 pm on Jan 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The first rule is intended to externally redirect non-canonical hostname requests to the canonical hostname -- but note that the [NC] flag defeats some of its utility and should be removed.
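
A short sketch of why [NC] works against a negated hostname test. The two variant hostnames named in the comments are illustrative assumptions, not taken from any server log:

```apache
# With [NC] on a negated pattern, any upper/lower-case spelling of the
# hostname counts as canonical, so WWW.Example.COM is NOT redirected:
#   RewriteCond %{HTTP_HOST} !^www\.example\.com [NC]
#
# Without [NC], and with the pattern end-anchored, only the exact string
# "www.example.com" is left alone. Mixed-case variants and, for instance,
# "www.example.com." or "www.example.com:80" are redirected:
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```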

The second rule unconditionally rewrites *all* requests to /index.php, appending the originally-requested path as the value of the "zone" parameter. But note that it will also rewrite the server's own request for "index.php?zone=Africa" to "index.php?zone=index.php"

So in simple terms, you've got an "infinite loop" there.
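
To make the re-processing concrete, here is the catch-all rule annotated with what appears to happen on a request for /Africa. This is a hand-written trace based on how per-directory (.htaccess) rewrites are normally re-run after a substitution; it is not output from a real server, and the exact stopping behaviour may vary by Apache version:

```apache
# The catch-all rule as posted, followed by a hand trace for "/Africa"
RewriteRule ^(.*) index.php?zone=$1 [NC,L]
#
# Pass 1: URL-path "Africa"    -> index.php?zone=Africa
#         [L] ends this pass, but in .htaccess a rewritten URL-path is
#         handed back to the server and the rules are run again.
# Pass 2: URL-path "index.php" -> index.php?zone=index.php
#         "index.php" also matches ^(.*), so the query string is replaced
#         (no [QSA] flag, so the old zone=Africa is discarded).
#         The path itself no longer changes, which is presumably why the
#         server stops here instead of looping forever; the script then
#         sees zone=index.php.
```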

Modify the code like this:


RewriteEngine on
#
RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
RewriteCond $1 !^index\.php$
RewriteCond $1 !\.[a-z]{2,}[2-9]?$
RewriteRule ^(.*)$ index.php?zone=$1 [L]

The first exclusion RewriteCond on the second rule now prevents index.php from being rewritten to itself, and I show an even-more-general example RewriteCond to prevent any URL-paths having appended "filetypes" from being rewritten as well.

You need to carefully assess your site's URL-space and decide which resources you do wish the script to handle, and which you do not. For example, it's unlikely that your script will be used to handle requests for robots.txt, sitemap.xml, image, CSS, or JavaScript files, etc.
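
If the set of zones is small and fixed, one way to act on that assessment is to rewrite only the names the script is meant to handle, rather than excluding what it should not. A minimal sketch, assuming the four zone names from the original post are the complete list and that an optional trailing slash is acceptable (both are assumptions):

```apache
RewriteEngine on
#
# Rewrite only the listed zone names. Anything else (index.php, robots.txt,
# images, CSS ...) does not match the pattern and is served as usual, so the
# rule cannot rewrite its own output.
RewriteRule ^(Africa|North-America|Asia|Caribbean)/?$ index.php?zone=$1 [L]
```

The trade-off is that every new zone has to be added to the pattern. Matching is case-sensitive here; adding [NC] would accept "africa" as well, but the script would then receive the zone name exactly as the client typed it.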

Jim

[edit] Removed unneeded [NC] flag from second rule as noted below. [/edit]

[edited by: jdMorgan at 2:24 pm (utc) on Jan. 12, 2010]

quack

2:14 pm on Jan 12, 2010 (gmt 0)

10+ Year Member



That worked!

Thanks much Jim - there's no way I would have gotten that.

quack

2:20 pm on Jan 12, 2010 (gmt 0)

10+ Year Member



OK - another question...

I'm not sure what you mean by


You need to carefully assess your site's URL-space and decide which resources you do wish the script to handle, and which you do not. For example, it's unlikely that your script will be used to handle requests for robots.txt, sitemap.xml, image, CSS, or JavaScript files, etc.

I assumed that because I didn't understand it, it doesn't apply to me - I don't plan on having any references to or restrictions on robots.txt, sitemap.xml, image, CSS, or JavaScript in the htaccess file other than caching. I'm not sure where to start reading to understand this (?)

Thanks again

jdMorgan

2:23 pm on Jan 12, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In case it's not obvious, you don't need the first RewriteCond on the second rule if the code 'works' on your site with the second RewriteCond in place. The first RewriteCond specifically prevents index.php requests from being rewritten, while the second prevents requests for *any* URL-path with a 'filename' on the end --including index.php-- from being rewritten.
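
In other words, the second rule could be reduced to something like the following sketch. It assumes every resource that must bypass the script has a lower-case extension, since the pattern is case-sensitive:

```apache
# Skip any URL-path ending in what looks like a filetype (.php, .css, .mp3 ...).
# index.php is covered by this too, so no separate index.php test is needed.
RewriteCond $1 !\.[a-z]{2,}[2-9]?$
RewriteRule ^(.*)$ index.php?zone=$1 [L]
```

A request for something like photo.JPG would not be excluded by this pattern and would be passed to the script.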

Also, I noticed that I left an unneeded [NC] on the second rule, which should be removed to prevent wasting CPU resources.

Jim

jdMorgan

2:33 pm on Jan 12, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The question is simply this: Is your index.php script set up to serve client requests for robots.txt, sitemap.xml, image, CSS, and JavaScript files, along with proper server response headers?

If so, then exclude only index.php from the second rule. If not, then exclude all requests having filetypes appended (if possible with your current site URL-architecture), or expand the first RewriteCond pattern to exclude only those resources which your script cannot or should not handle.
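
The first and last of those choices might look roughly like this. The directory names images/, css/ and js/ in Option B are hypothetical placeholders for wherever static files live on a given site:

```apache
# Option A: index.php serves everything itself (robots.txt, images, CSS ...),
# so only the script's own URL-path is excluded.
RewriteCond $1 !^index\.php$
RewriteRule ^(.*)$ index.php?zone=$1 [L]

# Option B (use INSTEAD of Option A, not in addition to it): list only the
# resources the script cannot or should not handle.
# RewriteCond $1 !^(index\.php$|robots\.txt$|sitemap\.xml$|images/|css/|js/)
# RewriteRule ^(.*)$ index.php?zone=$1 [L]
```

Option B is left commented out so the fragment is safe to paste as-is; enabling both would let Option A rewrite requests before Option B is ever consulted.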

This is a fairly critical decision here, and may have long-term effects on the future URL-architecture possibilities and ranking performance of your site. If you don't understand something, then the safe approach is to assume that it *does* apply to you, not that it doesn't.

These are aspects of server configuration that you need to know -- and that you should master unless you want to spend a lot of money on consulting and 'repairs'. This ain't simple, and there's no free lunch.

Jim

quack

2:34 pm on Jan 12, 2010 (gmt 0)

10+ Year Member



Great - thanks again!