I'm trying to set up redirects for the www canonical issue, for bogus URLs, and from index.htm to index.html.
The canonical redirect is working, but none of the others are. They all throw 404 errors when they should throw a 410 or redirect to the home page.
This site is in a subdomain of the main site, so I'm wondering whether that is the problem or whether there is something wrong with the code below.
Also, do I have the items arranged in the right order?
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^EXAMPLE\.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
RewriteRule ^(([^/]+/)*)index\.html$ http://www.example.com/index.html [R=301,L]
RewriteRule ^index\.htm$ http://www.example.com/index.html [R=301,L]
Plus, your second rule appears to be self-defeating and creates an 'infinite' redirection loop. The original purpose of this code was to redirect direct client requests for /index.html to "/". As shown above, it redirects /index.html to /index.html, which creates the loop.
You should always link to and refer to your home page as "/" and the only mention of index.html should be in your (currently) second rule and in the DirectoryIndex directive of your .htaccess file(s).
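As a rough sketch of what the second rule was originally meant to do, something like the following redirects direct client requests for index.html to the bare directory URL instead of back to index.html. It assumes the .htaccess file sits in the document root of www.example.com and that index.html is the only directory index file; adjust both to your setup.

```apache
# Assumption: index.html is the file served for "/" and for subdirectory URLs
DirectoryIndex index.html

RewriteEngine on

# Match only direct client requests for index.html. THE_REQUEST holds the
# original request line, so the internal lookup that DirectoryIndex performs
# for "/" does not trigger this rule and no loop is created.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.html\ HTTP/
# $1 is the directory path (empty for the root), so /index.html goes to "/"
RewriteRule ^(([^/]+/)*)index\.html$ http://www.example.com/$1 [R=301,L]
```

Because the target is the directory URL rather than /index.html, the redirected request no longer matches the rule.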
Unless it is occurring as a side-effect of the 'infinite' redirect loop, I cannot explain the 404 error. Perhaps you've declared a custom 500-Server Error ErrorDocument page, but that page does not exist? Look at your server error log file, and consider installing and using the "Live HTTP Headers" add-on for Firefox/Mozilla browsers; it will allow you to see the HTTP headers exchanged between your server and your browser, and may help to debug this problem.
Jim
Here is why I was using the 2nd condition/rule in my post above:
[webmasterworld.com...]
I assumed if it was good for one site to catch all sorts of bogus URLs it would be good for another.
The 3rd rule is needed because the owner of the site renamed the files from .htm to .html.
If you care about your search engine rankings, all "bogus" URLs should result in a 404-Not Found or 410-Gone response, leading to a nice helpful error page that explains the situation and contains text links to your home page, site map, and site search page, as applicable.
Only the non-canonical forms of *valid* URLs should be redirected to the canonical form of the URL, e.g. redirect /index.htm and /index.html to "/".
Catching *all* bogus URLs and redirecting them to your home page will result in an infinite URL-space on your site, and potentially massive duplicate-content problems.
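To illustrate the distinction, here is a minimal sketch that redirects only the old home page file name and lets everything else fail with a proper error response. The error page paths and the removed-page name are hypothetical placeholders, and the .htaccess file is again assumed to be in the document root of www.example.com.

```apache
# Hypothetical error pages. Use local paths here: a full http:// URL would
# make the server answer with a redirect instead of the 404/410 status.
ErrorDocument 404 /errors/not-found.html
ErrorDocument 410 /errors/gone.html

RewriteEngine on

# Non-canonical form of a valid URL: send the old /index.htm to "/"
RewriteRule ^index\.htm$ http://www.example.com/ [R=301,L]

# A page that was removed on purpose (hypothetical name): answer 410 Gone
RewriteRule ^old-page\.html$ - [G]

# No catch-all rule: any other URL that does not exist gets the normal
# 404 response and the custom error page declared above.
```

The error pages themselves are where the links to the home page, site map, and site search belong, so visitors still have somewhere to go without bogus URLs returning a 200 or a redirect.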
Jim