
Apache Web Server Forum

    
Apache: Stripping index.html Without Looping
Preventing repeated loops when removing index.html
zorvek




msg:1524887
 6:46 am on Jan 14, 2004 (gmt 0)

I believe I have found a way to strip the index file name from the URL without being affected by mod_dir's DirectoryIndex handling, and without having to rename the index.html file.

The problem, as I understand it, is that after the rewrite rule strips the file name from the URL, the URL, now lacking a file name, is passed through mod_dir, which adds it back on and forces another run through the rewrite rules. The rewrite rules alter it again and the process repeats.

I stumbled upon the environment variable SCRIPT_URL which can be tested to see if the index file was passed to the server in the original request - it is not altered by subsequent URL rewrites. This can be used to only do the rewrite if the file was present in the original request, and skip the rewrite if the file name was added by mod_dir.

RewriteCond %{ENV:SCRIPT_URL} ^(.*)/index.html$
RewriteRule ^(.*)/index\.shtml$ [domain.com$1...] [R=301,L]
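
To make the loop and the SCRIPT_URL test concrete, here is a toy Python model of the request flow described above. It is only a sketch under stated assumptions, not how Apache is implemented: the names handle and follow are hypothetical, a single index name (index.html) is assumed, and the variable script_url merely stands in for the idea that SCRIPT_URL keeps the originally requested path.

```python
import re

INDEX_RE = re.compile(r"^(.*)/index\.html$")

def handle(path, use_guard):
    """Toy model of one HTTP request: returns ("redirect", url) or ("serve", file)."""
    script_url = path  # assumption: stands in for SCRIPT_URL, the original request path
    current = path
    for _ in range(2):  # pass 1: URL as requested; pass 2: after the index file is added
        m = INDEX_RE.match(current)
        # Without the guard the rule always fires; with it, only if the
        # client itself asked for index.html.
        guard_ok = (not use_guard) or INDEX_RE.match(script_url) is not None
        if m and guard_ok:
            return ("redirect", m.group(1) + "/")
        if current.endswith("/"):
            current += "index.html"  # models mod_dir's DirectoryIndex step
            continue
        return ("serve", current)
    return ("serve", current)

def follow(path, use_guard, max_hops=5):
    """Follow redirects as a browser would; report a loop after max_hops."""
    hops = 0
    while hops < max_hops:
        kind, target = handle(path, use_guard)
        if kind == "serve":
            return ("served", target, hops)
        path = target
        hops += 1
    return ("loop", path, hops)

print(follow("/docs/index.html", use_guard=False))  # ('loop', '/docs/', 5)
print(follow("/docs/index.html", use_guard=True))   # ('served', '/docs/index.html', 1)
```

In this model the unguarded rule keeps redirecting /docs/ back to itself, while the guarded version redirects once and then serves the index file.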

I have tried this on my site and it seems to work without causing problems. Considering the other posts on this subject and the solutions proposed there, I am not yet convinced this really works. Can someone check this and let me know if it is a legitimate solution that doesn't break anything?

Thanks,
Kevin

 

jdMorgan




msg:1524888
 5:14 pm on Jan 14, 2004 (gmt 0)

Kevin,

Welcome to WebmasterWorld [webmasterworld.com]!

This sounds like a pretty good work-around! I haven't tried it, though, because I have a question:

In your RewriteCond, you refer to index.html, while in the RewriteRule, you reference index.shtml. While I believe you intended that they both be the same filetype, I'm not sure -- and I would rather not put this up on a live server for testing until I find out.

Thanks,
Jim

zorvek




msg:1524889
 7:08 pm on Jan 14, 2004 (gmt 0)

You are correct Jim. I copied the code into the message and then edited it a little to protect the innocent and make it more "searchable" - index.html versus index.shtml.

My site currently uses a number of default extensions and, if I can get some validation that I am not messing things up, I will try adding the other index variations to the condition and rule. Something like:

RewriteCond %{ENV:SCRIPT_URL} ^(.*)/(index|main)\.(html|htm|cgi|php|shtml)$
RewriteRule ^(.*)/(index|main)\.(html|htm|cgi|php|shtml)$ [domain.com$1...] [R=301,L]

What do you think?

Kevin

jdMorgan




msg:1524890
 8:58 am on Jan 15, 2004 (gmt 0)

Well, with just a bit of clean-up:

RewriteCond %{ENV:SCRIPT_URL} /(index|main)\.(s?html|htm|cgi|php)$
RewriteRule ^(.*)/(index|main)\.(s?html|htm|cgi|php)$ http://www.domain.com$1/ [R=301,L]

Note that "^.*" is never needed -- just omit it. Same for ".*$", but you didn't use that one. "^(.*)" is useful only if you wish to create a back-reference, as you do in your RewriteRule.

"s?html" matches "shtml" or "html" -- the "?" makes the "s" optional. You might also consider "s?html?" which would match "shtml", "shtm", "html", or "htm". This would allow you to shorten your patterns even more.

And finally, I recommend adding the trailing slash to your stripped URL in order to avoid a second external redirect due to mod_dir's missing trailing slash action (Again, I haven't tested it, so you may wish to leave well enough alone, at least for the initial testing). :)
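
The "s?html?" shorthand is easy to sanity-check outside Apache. The snippet below is a small sketch using Python's re module, which behaves the same way as Apache's regex engine for these simple constructs. The list of candidate extensions is made up for illustration.

```python
import re

# Anchored so the whole extension must match, as it does after "\." and before "$"
EXT = re.compile(r"^s?html?$")

# Hypothetical extensions to try, including a few that should not match
candidates = ["shtml", "shtm", "html", "htm", "php", "xhtml", "htmls"]
matched = [ext for ext in candidates if EXT.match(ext)]

print(matched)  # ['shtml', 'shtm', 'html', 'htm']
```

Only the four variants named above match, so "(s?html?|cgi|php)" would cover the same files as the longer alternation plus "shtm".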

Jim

zorvek




msg:1524891
 6:32 am on Jan 16, 2004 (gmt 0)

Thanks Jim! Works great! I tried it on our test site and it worked exactly as I expected. Checked the rewrite logs to make sure.

One added benefit is that if any "index" or "main" file is passed in that matches the condition, it ends up at the correct default index page for that directory via mod_dir, rather than at a 404 page. Very nice.
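
That behaviour follows from the substitution alone: the rule never checks whether the requested file exists, it just redirects to the directory URL. Here is a rough Python transcription of Jim's RewriteRule pattern to illustrate. The function name redirect_target and the sample paths are hypothetical, www.domain.com is the placeholder host from the thread, and mod_dir itself is not modelled.

```python
import re

# Python transcription of the RewriteRule pattern from Jim's cleaned-up version
RULE = re.compile(r"^(.*)/(index|main)\.(s?html|htm|cgi|php)$")
HOST = "http://www.domain.com"  # placeholder host used in the thread

def redirect_target(path):
    """Return the 301 target the rule would produce, or None if it does not match."""
    m = RULE.match(path)
    if m is None:
        return None
    return HOST + m.group(1) + "/"  # "$1/" in the RewriteRule substitution

print(redirect_target("/docs/main.php"))    # http://www.domain.com/docs/
print(redirect_target("/index.shtml"))      # http://www.domain.com/
print(redirect_target("/docs/about.html"))  # None
```

Whatever index variant the client asks for, the redirect lands on the directory URL, and mod_dir then picks whichever index file actually exists there.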

Kevin

