mod rewrite redirect never completes?

         

arikgub

9:58 am on Apr 11, 2008 (gmt 0)

10+ Year Member



I am trying to prevent search engines from indexing the following two URLs as two different pages:

http://www.example.com/a/
http://www.example.com/a/index.php

To redirect all requests for a/index.php to a/, I put the following lines in the a/.htaccess file:

RewriteEngine on
RewriteBase /
RewriteRule ^index\.php$ /a/ [R=301,L]

On my local machine it works fine. However when I upload the file to the server, I receive the following error message in the browser:

"Firefox has detected that the server is redirecting the request for this address in a way that will never complete."

I guess this message means that there is an infinite loop of redirects, but how can that be?! There is only one rewrite rule in the .htaccess file ... Also, I cannot spot anything in the .htaccess in the parent directory that redirects the 'a/' URL.

On my local apache it works fine. Is there anything in httpd.conf that may cause this kind of behavior?

Thanks

Marcia

12:50 pm on Apr 11, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"On my local apache it works fine"

The directives are relative to the root (and your local Apache knows where the root is in your local environment), so it is not clear to me exactly where /a/ and index.php sit in the example. I'll get brave and give it half a stab anyway.

RewriteEngine on
RewriteBase /
RewriteRule ^index\.php$ /a/ [R=301,L]

This is how I have it (just for the homepage, using .html extension) - using the full URL in the RewriteRule:


RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]

Since mine covers only the homepage, I suppose more would have to be coded (an expression using a back-reference) for the RewriteRule to work with all subdirectories.

Added:

Aha! Here's where I probably swiped mine from originally, and there's a full explanation of why THE_REQUEST is used:

[webmasterworld.com...]

There's plenty more:

subdirectory redirect index.html [google.com]

[edited by: Marcia at 1:04 pm (utc) on April 11, 2008]

arikgub

2:51 pm on Apr 11, 2008 (gmt 0)

10+ Year Member



Marcia,

Many thanks, that solved my problem. I only changed the rule slightly to apply in the directory a/, but not in the sub-directories.


RewriteEngine on
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.php\ HTTP/
RewriteRule ^index\.php$ /a/ [R=301,L]

I guess using THE_REQUEST was the key. I still do not understand "the loop" thing, though. How was /a/ redirected back to /a/index.php?

Thanks again

jdMorgan

3:14 am on Apr 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Because there's probably a directive in .htaccess or the server config file similar to

DirectoryIndex index.html index.htm index.php

which will internally rewrite any request for "/" at any directory level to the first existing file in the list that can be found.

So, your browser requested "/", DirectoryIndex rewrote it to "/index.php", and then your rewrite rule redirected the browser to "/" and the loop repeated.

By examining the browser request, your rewrite rule will redirect only if the client asked for "index.php" and not if the current URL is index.php as the result of a DirectoryIndex directive or an internal rewrite rule.
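The loop and the fix can be reproduced in a toy model. This is only a sketch of the behaviour described above, not of Apache's internals: the `handle` and `follow` helpers, the single `index.php` index file, and the 10-hop limit are all assumptions made for illustration.

```python
import re

DIRECTORY_INDEX = "index.php"   # assumed: the index file that exists in /a/
MAX_HOPS = 10                   # arbitrary cut-off standing in for the browser's redirect limit

# Condition from the thread: true only when the client itself asked for index.php
THE_REQUEST_RE = re.compile(r"^[A-Z]{3,9}\ /.*index\.php\ HTTP/")


def handle(requested_path, check_the_request):
    """Return the redirect target for one request under /a/, or None if content is served."""
    request_line = f"GET {requested_path} HTTP/1.1"   # what the browser actually sent
    current = requested_path
    # Modelled DirectoryIndex step: a directory request is internally mapped to its index file.
    if current.endswith("/"):
        current += DIRECTORY_INDEX
    # Per-directory rule from a/.htaccess: RewriteRule ^index\.php$ /a/ [R=301,L]
    local = current[len("/a/"):]
    if re.fullmatch(r"index\.php", local):
        if not check_the_request or THE_REQUEST_RE.search(request_line):
            return "/a/"
    return None


def follow(start, check_the_request):
    """Follow redirects like a browser would; return (hops, gave_up)."""
    path, hops = start, 0
    while hops < MAX_HOPS:
        target = handle(path, check_the_request)
        if target is None:
            return hops, False
        path, hops = target, hops + 1
    return hops, True
```

Calling `follow("/a/index.php", False)` never settles and hits the hop limit, while `follow("/a/index.php", True)` redirects once to `/a/` and then serves the page.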

You can make the RewriteCond a bit more efficient by eliminating the ambiguous ".*" subpattern:


RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/

The replacement subpattern says, look for one or more characters which are not a slash, followed by a slash, and as many of those sequences as you like -- including zero. This saves the parser a lot of work, since it now knows precisely when to stop looking for subdirectory path info, and when to start looking for "index\.php\ HTTP/".
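For anyone who wants to compare the two patterns outside the server, here is a small check using Python's `re` module. Both patterns are copied from this thread; the sample request lines and the `matches` helper are made up for illustration, and Python's regex engine is assumed to behave like Apache's for these simple patterns.

```python
import re

# Patterns copied from the thread; everything else here is illustrative.
LOOSE = re.compile(r"^[A-Z]{3,9}\ /.*index\.php\ HTTP/")
STRICT = re.compile(r"^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/")

samples = [
    "GET /index.php HTTP/1.1",
    "GET /a/index.php HTTP/1.1",
    "GET /a/b/index.php HTTP/1.1",
    "GET /a/ HTTP/1.1",
    "GET /a/myindex.php HTTP/1.1",
    "GET /a/index.php?x=1 HTTP/1.1",
]


def matches(pattern):
    """Return one True/False per sample request line, in order."""
    return [bool(pattern.search(line)) for line in samples]


for line, loose, strict in zip(samples, matches(LOOSE), matches(STRICT)):
    print(f"{line!r:40} loose={loose} strict={strict}")
```

The two patterns agree on everything except `/a/myindex.php`, which only the ".*" version matches. The last sample shows that neither pattern matches when a query string follows index.php.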

Jim

g1smd

6:48 pm on Apr 12, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Do you only want to redirect for the index file in the /a/ folder?

Why not redirect for all folders?

It takes just a few extra characters in the existing rule to do that.

arikgub

1:57 pm on Apr 13, 2008 (gmt 0)

10+ Year Member



jdMorgan, thanks for the explanation and the regex optimization suggestion. I'll use it.

g1smd, I believe it is a good idea to redirect in sub-directories too, but in my case I have a few index.php URLs in the subdirs that rank very well in Google, and I would not like to tempt fate.

jdMorgan

3:35 pm on Apr 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



To handle /index.php files in all subdirectories, remove the code from example.com/a/.htaccess, and put the following into example.com/.htaccess:

RewriteEngine on
#
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.php\ HTTP/
RewriteRule ^(([^/]+/)*)index\.php$ http://www.example.com/$1 [R=301,L]
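To see what the back-reference $1 produces for a given path, the RewriteRule pattern can be tried out in Python. This is only a sketch: the `canonical` helper is hypothetical, it ignores the RewriteCond, and it assumes the pattern is matched against the URL-path without its leading slash, as it is in a root-level .htaccess.

```python
import re

# Pattern and target copied from the RewriteRule above.
RULE = re.compile(r"^(([^/]+/)*)index\.php$")
TARGET = "http://www.example.com/"


def canonical(path):
    """Return the 301 target for a directly requested path, or None if the rule does not match."""
    m = RULE.match(path.lstrip("/"))   # drop the leading slash before matching
    return TARGET + m.group(1) if m else None


for p in ["/index.php", "/a/index.php", "/a/b/index.php", "/a/page.php", "/a/"]:
    print(p, "->", canonical(p))
```

Group 1 captures the whole directory prefix (possibly empty), so each index.php URL maps to its own folder URL.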

I'd consider it "tempting fate" to *not* fix all /index.php-versus-slash duplication if you do have multiple index.php files in subdirectories on the site.

Make sure that every page on your site can be reached with one and only one URL; all other URL variations should be redirected to the single, canonical URL.

Look for problems with:

  • example.com vs. www.example.com
  • example.com vs. example.com., example.com:80, and example.com.:80
  • example.com vs. any other "vanity domains" you may have which resolve to the same site
  • http vs. https
  • /index.xyz versus "/"
or any others.
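One way to audit a list of URLs for the variations above is to normalise each one and compare it with the original. The sketch below is not server configuration, just a checking aid. The choice of http and www.example.com as canonical, the `ALIASES` and `INDEX_FILES` sets, and the `canonical_url` helper are all assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_SCHEME = "http"                       # assumption: http is the preferred scheme
CANONICAL_HOST = "www.example.com"              # assumption: www is the preferred hostname
ALIASES = {"example.com", "www.example.com"}    # hypothetical; add any vanity domains here
INDEX_FILES = {"index.php", "index.html", "index.htm"}


def canonical_url(url):
    """Map a URL variation to the single canonical form; foreign hosts are returned unchanged."""
    parts = urlsplit(url)
    # .hostname is lowercased and excludes the port; rstrip removes a trailing dot
    host = (parts.hostname or "").rstrip(".")
    if host not in ALIASES:
        return url
    head, _, last = parts.path.rpartition("/")
    # Replace a trailing index file with the bare folder URL
    path = head + "/" if last in INDEX_FILES else parts.path
    return urlunsplit((CANONICAL_SCHEME, CANONICAL_HOST, path or "/", parts.query, ""))


for u in ["http://example.com/", "http://example.com.:80/a/index.php",
          "https://www.example.com/index.html"]:
    print(u, "->", canonical_url(u))
```

Any URL for which `canonical_url(u) != u` is a variation that should answer with a 301 to the canonical form.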

Jim