I tried to redirect index.html to domain.com, but I get some kind of redirect loop. How do I do this? Should I use mod_rewrite somehow, or a plain redirect?
I've found a couple of different methods, including the one jdmorgan points to above for redirecting index.html to /. I have a question, and I'd appreciate it if someone could elaborate.
From the link above:
--------------
1)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]
I'm using this one, as it seems to work for me so far.
--------------
2)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ $1 [R=301,L]
I tried this first, but it caused index.html to come back as not found instead of redirecting.
--------------
From another thread on WebmasterWorld
--------------
3)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html?
RewriteRule ^index\.html?$ [yoursite.com...] [R=301,L]
I haven't tried this one, as the first one above appears to be working.
--------------
Question:
Is #1 the best option, and are there any issues or potential issues I should be aware of?
Thanks.
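On option 2 coming back as not found: one likely explanation, not confirmed anywhere in this thread, is that in .htaccess (per-directory) context a substitution without a leading slash is treated as a relative path. mod_rewrite then prefixes it with the directory's filesystem path unless RewriteBase is set, so the redirect can point at a URL that does not exist. A minimal sketch, assuming the .htaccess file sits in the document root and mod_rewrite is enabled:

```apache
RewriteEngine On
# Assumption: this .htaccess lives in the document root, so the URL base is "/".
# RewriteBase sets the URL prefix that mod_rewrite uses for relative substitutions.
RewriteBase /
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ $1 [R=301,L]
```

Option 1 sidesteps the question by writing the substitution as /$1, which is already a URL-path.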
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ http://www.example.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
It was a bit disconcerting, so I removed it for now. I probably won't have time to mess with it until after the US holidays. I really wanted this to work. Maybe I should just rewrite index.php to index.html and live with there being an index.html out there. Do Google et al. treat www.example.com/ and www.example.com/index.html as two separate pages?
I haven't gotten any bots except Yahoo past the root of the domain since I implemented this. So, if I see a 301 code logged for /, does that mean the bot had requested index.html? If I request index.html myself, I see a 301 for index.html and a 200 for /. It seems like that's what the bot should get, too.
Sometimes I'm seeing 301s for robots.txt, too. I don't think the site/host is unreliable. Also, MSNbot only hits robots.txt and goes away, though it is not excluded (or mentioned) in my robots.txt.
It is entirely up to the client 'bot or browser as to how it wants to respond to a server 301 response. Note the terms -- server and client -- because it is the client who is 'in charge' and the server does the client's bidding. So, it is up to the 'bot to decide if it wants to follow the 301 now, or put it in a queue for later processing. Without seeing raw logs or knowing whether you've got canonicalization issues, it's hard to tell what's happening with your site. All I can really say is that if a client requests index.html from any domain that resolves to this .htaccess file's directory, it will get a 301 to "www.example.com/" and any issues outside of that need to be handled separately.
To rule out canonical domain redirects as a cause of spider 'pauses', you could try modifying the code (from my second example, here) so that it won't ever change the requested domain:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://%{HTTP_HOST}/ [R=301,L]
Jim
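For reference, the host-preserving rule above could sit in a complete .htaccess roughly like this. It is only a sketch: it assumes mod_rewrite is enabled, the file is in the document root, and the site is served over plain http. The comments describe why the THE_REQUEST condition avoids the kind of loop asked about at the top of the thread.

```apache
RewriteEngine On
# THE_REQUEST is the request line exactly as the client sent it, for example
# "GET /index.html HTTP/1.1". Internal rewrites and subrequests do not change it,
# so this condition is true only when the client itself asked for /index.html,
# not when a request for / is mapped to index.html internally (e.g. via DirectoryIndex).
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
# Redirect to the root of whatever hostname was requested, leaving the domain unchanged.
RewriteRule ^index\.html$ http://%{HTTP_HOST}/ [R=301,L]
```

Without the condition, a rule that matches index.html can also fire on the internal mapping of / to index.html, which is a common source of redirect loops.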