Redirecting code check

Checking code for redirects and extensionless URLs

         

hburlingameiii

10:49 pm on Oct 31, 2009 (gmt 0)

10+ Year Member



Hi,

I am new to making websites and am building my first one (just a small personal website), but I am trying to do it right the first time around. I am trying to write my .htaccess file to avoid any duplication and also to have extensionless URLs in case I change from HTML later on. I have been reading these forums, and the following is what I came up with. It seems to be working, but I was hoping that one of you experts could look it over and make sure I am not doing anything wrong (i.e. nothing superfluous, no typos, everything in the correct order, etc.), as I am definitely no expert (I only learned today that there even was such a thing as an .htaccess file). Also, is there anything I should add to this file? I want to get all this kind of stuff sorted out before I start coding the website.

Thanks!
HB

RewriteEngine on
RewriteBase /

#
# Redirect direct client request for URL with .html extension
# to new extensionless URL if the .html file exists
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/\ ]+/)*[^.\ ]+\.html\ HTTP/
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(([^/]+/)*[^.]+)\.html$ http://www.example.com/$1 [R=301,L]

#
# Redirect any request for a URL with a trailing slash to extensionless URL
# without a trailing slash unless it is a request for an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.example.com/$1 [R=301,L]

#
#Redirect index to root
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\ HTTP/
RewriteRule ^index$ http://www.example.com/ [R=301,L]

#
# Redirect parked domain and non-www to www main domain
RewriteCond %{HTTP_HOST} ^(www\.)?myparkeddomain\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

#
# Internally rewrite extensionless URL request
# to .html file if the .html file exists
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.html [L]

[edited by: jdMorgan at 2:50 am (utc) on Nov. 1, 2009]
[edit reason] example.com [/edit]

g1smd

10:59 pm on Oct 31, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You'll need at least one small change.

Redirect both index.html and index directly to /

That is, don't chain those redirects.

Take your existing index redirect and move it to be the very first rule. Make it operate for both index.html and index requests.

Additionally, does it need to work for index in folders, or just in the root? I'd expand it to work for all folders.

.

The four lines of code under this comment...

# Redirect parked domain and non-www to www main domain

can probably be simplified to...

RewriteCond %{HTTP_HOST} !^www\.example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

I haven't looked at all of the code, so I might have other comments later.

[edited by: jdMorgan at 2:51 am (utc) on Nov. 1, 2009]
[edit reason] example.com [/edit]

hburlingameiii

12:13 am on Nov 1, 2009 (gmt 0)

10+ Year Member



Thanks g1smd, I replaced the four lines with your simplified version and it works fine. Also, it is just a small website with no folders, so I only need to redirect the index in the root.

Also, how would I modify the code to redirect both index.html and index to /? I'm trying to learn what all the regular expressions mean, but I am still a beginner.

Thanks again!
HB

jdMorgan

2:49 am on Nov 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I strongly recommend tweaking that domain canonicalization redirect to

RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

in case an HTTP/1.0 client sends a request to your server -- you don't want an infinite redirection loop to result if the client is incapable of sending the HTTP "Host:" header (HTTP_HOST will be blank)...
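
To see the difference outside Apache, the two negated patterns can be compared in a few lines of Python. This is only a rough sketch: it assumes Python's `re` module treats these simple patterns the same way Apache's regex engine does, and the helper name `would_redirect` is made up for illustration.

```python
import re

# Pattern from the simplified condition suggested earlier in the thread.
SIMPLE = re.compile(r"^www\.example\.com$")
# Pattern from the condition above, which also accepts a blank hostname.
BLANK_SAFE = re.compile(r"^(www\.example\.com)?$")

def would_redirect(pattern, host):
    """Mimic 'RewriteCond %{HTTP_HOST} !pattern': true when the host does NOT match."""
    return pattern.search(host) is None

# The last two entries are a host with an explicit port and a blank Host header.
for host in ["www.example.com", "example.com", "myparkeddomain.com",
             "www.example.com:80", ""]:
    print(repr(host), would_redirect(SIMPLE, host), would_redirect(BLANK_SAFE, host))
```

With a blank hostname the simpler pattern still triggers the redirect, while the version with `(...)?$` lets the request through. Both patterns treat a hostname with an appended port number as non-canonical.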

A general rule to redirect all "index" requests to "/":


# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]
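
For anyone still learning the regular expressions, here is a rough Python sketch of what the RewriteRule pattern captures. It assumes the path is given the way .htaccess sees it (no leading slash), it ignores the RewriteCond check on THE_REQUEST, and it leaves out the trailing "?" that clears the query string. The function name `redirect_target` is made up for illustration.

```python
import re

# Same pattern as the RewriteRule above: optional folders, then "index",
# then an optional lowercase extension such as .html or .php.
INDEX_RULE = re.compile(r"^(([^/]+/)*)index(\.[a-z0-9]+)?$")

def redirect_target(path):
    """Return the URL the rule would redirect to, or None if the pattern does not match."""
    match = INDEX_RULE.match(path)
    if match is None:
        return None
    # Group 1 ($1 in the rule) holds the folder part, including its trailing slash.
    return "http://www.example.com/" + match.group(1)

for path in ["index", "index.html", "blog/index.php", "a/b/index", "indexes", "about"]:
    print(path, "->", redirect_target(path))
```

The folder names here are only examples. The point is that $1 holds everything before "index", so each index URL goes in a single hop to the root of its own folder.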

Jim

hburlingameiii

5:42 pm on Nov 1, 2009 (gmt 0)

10+ Year Member



Thanks Jim, here is what I have now:

RewriteEngine on
RewriteBase /
#
# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]
#
# Redirect direct client request for URL with .html extension
# to new extensionless URL if the .html file exists
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/\ ]+/)*[^.\ ]+\.html\ HTTP/
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^(([^/]+/)*[^.]+)\.html$ http://www.example.com/$1 [R=301,L]
#
# Redirect any request for a URL with a trailing slash to extensionless URL
# without a trailing slash unless it is a request for an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.+)/$ http://www.example.com/$1 [R=301,L]
#
# Redirect parked domains and non-www to main domain for domain canonicalization
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
#
# Internally rewrite extensionless URL request
# to .html file if the .html file exists
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule ^(([^/]+/)*[^./]+)$ /$1.html [L]

Does this look good? And is there anything else I should watch out for?

Thanks again,
HB

g1smd

6:50 pm on Nov 1, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think it looks OK.

It needs to be tested with a large number of both valid and non-valid URL requests: for root and folder URLs, with and without www, with and without port numbers, with and without .html extensions, and with and without parameters (valid or not), etc.
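
One way to build such a list is a short script. This is only a sketch: the page names ("about", "folder/page"), the query string, and the function name `build_test_urls` are made up for illustration, and some of the combinations are deliberately invalid URLs.

```python
from itertools import product

def build_test_urls(hosts, ports, paths, suffixes, queries):
    """Combine every host, port, path, suffix, and query string into one URL each."""
    urls = []
    for host, port, path, suffix, query in product(hosts, ports, paths, suffixes, queries):
        urls.append(f"http://{host}{port}/{path}{suffix}{query}")
    return urls

urls = build_test_urls(
    hosts=["www.example.com", "example.com", "myparkeddomain.com"],
    ports=["", ":80"],                       # without and with a port number
    paths=["", "index", "about", "folder/page"],
    suffixes=["", ".html", "/"],             # extensionless, .html, trailing slash
    queries=["", "?x=1"],                    # without and with parameters
)

for url in urls:
    print(url)
```

Each printed URL can then be requested in turn and its response recorded.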

Try as many permutations as possible and check them all for correct operation, correct HTTP response codes, and especially ensure no chains in any redirections.
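
A rough sketch of how the response codes and chains could be checked with nothing but the Python standard library follows. `http.client` does not follow redirects on its own, so every hop stays visible. The function names `http_fetch` and `redirect_chain` are made up for illustration, and the sketch assumes plain HTTP.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)

def http_fetch(url):
    """Request one URL without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80, timeout=10)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    conn.request("GET", target)
    response = conn.getresponse()
    result = (response.status, response.getheader("Location"))
    conn.close()
    return result

def redirect_chain(url, fetch=http_fetch, limit=5):
    """Follow Location headers by hand; return a list of (url, status) hops."""
    hops = []
    for _ in range(limit):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)
    return hops
```

Calling `redirect_chain("http://example.com/index.html")` against the live site returns the hops in order. A well-behaved request shows at most one 301 followed by a 200 (or a 404 for an invalid URL). More than one redirect in the list means a chain.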

hburlingameiii

8:10 pm on Nov 1, 2009 (gmt 0)

10+ Year Member



Sounds good, g1smd. How exactly can I check them for correct operation, response codes, etc.? Is there a log file or something I can look at?

Thanks again,
HB