.htaccess question

Redirect to slash for existing directories only

         

Torontonian

7:53 pm on Sep 6, 2009 (gmt 0)

I'm working on a new website, for which I'm using index.shtml files in most directories (including the root) and redirecting them to a slash, i.e.,

http://www.example.com/index.shtml
redirects to http://www.example.com/
http://www.example.com/contact/index.shtml
redirects to http://www.example.com/contact/
etc.

Right now, the relevant part of my .htaccess code looks like this:

RewriteEngine on
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /(([^/]+/)*)index\.shtml\ HTTP/
RewriteRule index\.shtml$ http://www.example.com/%1 [R=301,L]

The thing that's bothering me is that, with the above code, even index.shtml files in non-existent directories would be redirected, e.g.

www.example.com/qwerty/blablabla/index.shtml
would be redirected to
www.example.com/qwerty/blablabla/
before ultimately displaying a 404 error.

Redirecting and then displaying a 404 doesn't seem "right" to me. The only alternatives I can think of are (a) to have a separate .htaccess file in each directory where I want index.shtml redirected to slash, or (b) to add a line to the root .htaccess whereby it redirects to slash only after verifying (somehow) that the particular index.shtml file exists. I'm wondering which of these two methods (or perhaps another one that I haven't thought of) would be preferable with respect to search-engine friendliness, site-loading time, and best practices for web programming in general.
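For illustration, option (a) could look roughly like the sketch below for a single directory. The /contact/ directory name is taken from the example URLs above. Placing the file at /contact/.htaccess is an assumption, as is the optional query-string group in the condition. As far as I know, rewrite rules in the root .htaccess are not applied inside a subdirectory that has its own rewrite rules unless RewriteOptions Inherit is set, so each such file would have to be self-contained.

```apache
# Hypothetical /contact/.htaccess for option (a); one such file per directory.
RewriteEngine on
# Act only when the client itself asked for /contact/index.shtml, not when
# the server internally maps /contact/ to its index file.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /contact/index\.shtml(\?[^\ ]*)?\ HTTP/
# In per-directory context the pattern is matched against the path relative
# to this directory, so the directory name is not part of the pattern.
RewriteRule ^index\.shtml$ http://www.example.com/contact/ [R=301,L]
```

The obvious cost of this approach is maintenance: every new directory needs its own copy, with the directory name edited in two places.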

Any advice would be greatly appreciated.

[edited by: jdMorgan at 3:08 am (utc) on Sep. 7, 2009]
[edit reason] example.com [/edit]

jdMorgan

3:06 am on Sep 7, 2009 (gmt 0)

The purpose of this kind of rule is to 'fix' search engine listings where someone has linked to the index-page 'file' instead of linking to the directory-index URL. And this presumes that the index page 'filename' got 'exposed' somehow to search engines and/or visitors in the past.

So the presumption of this type of rule is that the directory and/or index file does exist.

However, if you wish to check this before redirecting, it's a simple matter of using a RewriteCond to check that the file exists. But these 'exists' checks require a call to the operating system and, if the current filesystem state is not cached, a read of the physical disk as well. Both operations cost time, which is another reason you won't see this check done very often.

As in the code below, 'exists' checks (and reverse-DNS lookups as well) should always be the last RewriteCond, and the RewriteRule pattern should be as specific as possible, so that these 'inefficient' checks and lookups are not executed unless the RewriteRule pattern and all other RewriteConds match.


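RewriteEngine on
# Cheap test first: the client itself requested a URL ending in index.shtml (query string optional)
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.shtml(\?[^\ ]*)?\ HTTP/
# Filesystem test last: it runs only if the rule pattern and the condition above matched
RewriteCond %{REQUEST_FILENAME} -f
# Redirect to the directory URL; $1 is the directory path captured by the pattern
RewriteRule ^(([^/]+/)*)index\.shtml$ http://www.example.com/$1 [R=301,L]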
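
If the goal is literally "existing directories only", a variant is to test for the directory rather than for the index file. This is only a sketch and not part of the code above. It assumes the .htaccess file sits in the document root and that DOCUMENT_ROOT reflects the real filesystem path, which may not hold with aliases or some shared-hosting setups.

```apache
RewriteEngine on
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index\.shtml(\?[^\ ]*)?\ HTTP/
# $1 is the directory path captured by the RewriteRule pattern below.
# Assumption: this file is in the document root, so DOCUMENT_ROOT/$1 is the
# filesystem path of the requested directory.
RewriteCond %{DOCUMENT_ROOT}/$1 -d
RewriteRule ^(([^/]+/)*)index\.shtml$ http://www.example.com/$1 [R=301,L]
```

The -f test on index.shtml is the stricter of the two: a directory can exist without an index file, in which case this variant would still redirect to a URL that does not serve the page.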

Jim

g1smd

8:08 am on Sep 7, 2009 (gmt 0)

If the URL that returns 404 has never existed, and some other URL redirects to it, I think that is unlikely to cause a problem.

Torontonian

4:35 pm on Sep 7, 2009 (gmt 0)

Thank you both for the informative answers!