.htaccess redirects for index.html, index.php and index.htm to /

     
5:34 am on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


I have been reading many different opinions and code snippets for redirecting all of my index files to /. Most of my pages end in index.htm, but some have index.html and index.php.

From what I have read, it is considered duplicate content if you do not redirect all of the index files in all folders/directories to /. Is that right?

I have seen this used in .htaccess for redirects:

RewriteCond %{THE_REQUEST} ^.*/index\.html
RewriteRule ^(.*)index.html$ http://www.mydomain.com/$1 [R=301,L]


Is this correct, and do I need to do this for index.htm and index.php separately, or is there a way to put this all together in one redirect?

I currently only have this in my .htaccess file:

RewriteEngine on

RewriteCond %{HTTP_HOST} ^mydomain\.com$ [NC]
RewriteRule ^(.*)$ http://www\.mydomain\.com/$1 [R=301,L]


I know that there are many discussions on this, but I cannot find one that deals specifically with index.html, index.htm and index.php all together. Like I said earlier, most of my files have index.htm, but I also have index.html and index.php files that I want to redirect. I am not even sure that I should be redirecting, but I don't want to have duplicate content problems. This is way over my pay grade.
5:53 am on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


I just found this .htaccess code:

RewriteEngine on

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.{htm|html|php)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(htm|html|php)$ http://example.com/$1 [R=301,L]


So, if the code above is correct, this is what my .htaccess file will look like:

RewriteEngine on

RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www\.example\.com/$1 [R=301,L]

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.{htm|html|php)\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(htm|html|php)$ http://example.com/$1 [R=301,L]
6:04 am on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


That last code did not work, but this one did:

RewriteEngine on

RewriteCond %{HTTP_HOST} ^website\.com$ [NC]
RewriteRule ^(.*)$ http://www\.website\.com/$1 [R=301,L]

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.website.com/$1? [R=301,L]


Before I use this permanently, is this a good idea, and what am I going to be in for with Google when I make this change to my entire site?
6:13 am on Aug 16, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12704
votes: 244


#1. Dump the

^.*

anywhere it occurs. If you're not capturing it, you don't need it.

#2. By the usual jaw-dropping, mind-boggling coincidence I have just this minute finished reading a post in this very Forum about redirecting to index files. Putter around and you will find the exact wording for your htaccess. It should be only a few posts away from this one.

Yes, you can combine extensions. You don't even need to be specific about them; if someone comes along and requests /index.bzzt, you can redirect them right along with everyone else. (ONLY for redirects! When rewriting, accept only canonical URLs.)

So rather than {blahblah}index\.(php|html?) you could simply say index\.\w+ and leave it at that. It seems safe to say that if someone asks for "index." alone, or "index.one-two-three", they are probably up to no good and can be safely 404'd. You gotta draw the line somewhere. Your RewriteCond can be even more minimalist and just say

RewriteCond %{THE_REQUEST} index

because the more detailed format is already in the Rule. You just need to limit the redirect to requests that actually asked for something containing "index", so that requests for the bare directory (which Apache serves from the index file internally) are left alone. That's assuming you have not goofed horribly by having directory names or query terms that also contain the string "index". If you do, the Condition may have to be more exact.
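
Put together, a rough sketch of that minimalist version might look like the following. It assumes the .htaccess sits in the site root, that www.example.com stands in for your real hostname, and that none of your directory names or query strings contain "index". Treat it as an illustration rather than tested code.

```apache
RewriteEngine On

# Act only when the client's original request line mentions "index".
# A request for a bare directory is served from the index file
# internally, and that internal step never shows up in THE_REQUEST.
RewriteCond %{THE_REQUEST} index
# Any extension made of word characters is accepted here:
# index.htm, index.html, index.php, even index.bzzt
RewriteRule ^(([^/]+/)*)index\.\w+$ http://www.example.com/$1 [R=301,L]
```

Because the target already names the full canonical hostname, a request for an index file on the non-www host is fixed in a single redirect.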
6:24 am on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


I have also come across these:

RewriteCond %{THE_REQUEST} ^.*\/index\.html? 
RewriteRule ^(.*)index\.html?$ http://www.domain.com/$1 [R=301,L]

OR

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html?\ HTTP/
RewriteRule ^(.*)index\.html?$ http://www.domain.com/$1 [R=301,L]



So basically, what is the best course of action? This can get really confusing and I just want to do the best thing. Thanks!
6:31 am on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


Also, once I get this right, can I keep linking to the index.htm files instead of the folders when I am building my site? It is much easier for me to manage my links when working on the website when linking to index.htm etc. instead of just /.
8:54 am on Aug 16, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12704
votes: 244


Goodness. I don't think I even saw your in-between posts while composing mine.

You can link to "index.htm" for development purposes, so long as you remember to delete all of them before uploading the files to the real www space. If it's a big site, you may need to locate a doodad that does this globally without having to open every file. (Practice on a backup first!)

Or you can install a pseudo-server like MAMP or WAMP; then everything will work just like on a "real" site. That also includes the, er, opposite end of any link: the part where you generally want to say /directory/overhere rather than ../../directory/overhere. Site-absolute rather than relative links. Not only for other pages but for stylesheets, includes and similar.
1:30 pm on Aug 16, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Any code with .* at the beginning or in the middle of a pattern can and should be changed to be more "specific".

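The order of your redirects is important. The index redirect must come before the non-www/www redirect; otherwise you introduce an unwanted multi-step redirection chain when a non-www index URL is requested. For example, a request for example.com/index.html would be sent first to www.example.com/index.html and only then to www.example.com/.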
4:47 pm on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


Thanks Lucy24 and g1smd! It is a large site and I have found a "doodad" and have already changed globally to / but have not uploaded. I am still trying to get the redirect correct and honestly I am confused about the fine details and the (.*).

I have this now:

RewriteEngine on

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.website.com/$1? [R=301,L]

RewriteCond %{HTTP_HOST} ^website\.com$ [NC]
RewriteRule ^(.*)$ http://www\.website\.com/$1 [R=301,L]


#1. Dump the ^.*



Yes, you can combine extensions. You don't even need to be specific about them; if someone comes along and requests /index.bzzt, you can redirect them right along with everyone else. (ONLY for redirects! When rewriting, accept only canonical URLs.)

So rather than {blahblah}index\.(php|html?) you could simply say index\.\w+ and leave it at that. It seems safe to say that if someone asks for "index." alone, or "index.one-two-three", they are probably up to no good and can be safely 404'd. You gotta draw the line somewhere. Your RewriteCond can be even more minimalist and just say

RewriteCond %{THE_REQUEST} index


I am not sure how to apply Lucy24's suggestions above. Could someone give me an example of the code above changed to follow Lucy24's suggestions? Thanks!
7:14 pm on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


This is all I want to do, nothing more - using .htaccess:

Here are a few examples:
http://mydomain.com
http://mydomain.com/index.html
http://mydomain.com/index.htm
http://mydomain.com/index.php
http://www.mydomain.com/index.php
http://www.mydomain.com/index.html
http://www.mydomain.com/index.htm
all redirect to:
http://www.mydomain.com/

http://mydomain.com/subdir
http://mydomain.com/subdir/index.html
http://mydomain.com/subdir/index.htm
http://mydomain.com/subdir/index.php
http://www.mydomain.com/subdir/index.php
http://www.mydomain.com/subdir/index.html
http://www.mydomain.com/subdir/index.htm
all redirect to:
http://www.mydomain.com/subdir/


I am just looking for the best code to do this, and I have seen it done a bunch of different ways in this forum with many differing suggestions. I cannot seem to understand the specifics, and I need to get it right, as I have changed my entire site to point to / only and I am waiting to upload until I get this right. Thanks!
7:53 pm on Aug 16, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


I use something similar to this:

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z](3,9)\ /([^/]+/)*index\.[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]


# Redirect non-canonical hostname to www
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


(\.[a-z0-9]+)?[^\ ]*\ HTTP/
is "period followed by some characters followed by stuff that is not a space then HTTP/" and simplifies to
\.[^\ ]*\ HTTP/


Don't redirect only "example.com", as you would miss other non-canonical hostnames such as "www.example.com:80".

The NC flag is unwanted.

(.*) captures everything, so anchoring is not required.

Do not escape periods in rule target.
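
To make the hostname part concrete, here is the same Condition again with a few sample Host header values annotated. The sample values are made up for illustration, and the comments assume %{HTTP_HOST} carries the Host header as the client sent it.

```apache
# The pattern is negated with "!", so the redirect fires for every Host
# value EXCEPT an exact "www.example.com" or an empty Host.
#
#   Host: www.example.com      -> no redirect (already canonical)
#   Host: (empty or missing)   -> no redirect (the "?" makes the name optional)
#   Host: example.com          -> redirect
#   Host: www.example.com:80   -> redirect
#   Host: WWW.EXAMPLE.COM      -> redirect (no [NC], so case differences count)
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
```

Listing the one hostname you do want, and redirecting everything else, is what lets a single rule catch variants you never thought of.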
10:07 pm on Aug 16, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


Thanks g1smd!

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z](3,9)\ /([^/]+/)*index\.[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]


Not sure why, but this code did not redirect for me when I tested it.

The following code did, but I am not sure if it needs to be tweaked or not:

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]
10:37 pm on Aug 16, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Typo! Wrong type of brackets: (3,9) should have been {3,9}

Take the time to learn the RegEx syntax. This stuff pops up all the time.
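
For anyone puzzled by the difference, here is the corrected pair with the two bracket forms spelled out in comments. This is only a sketch of the same rule with www.example.com as a placeholder hostname.

```apache
# [A-Z](3,9)  = one capital letter, then a group matching the literal text "3,9",
#               so an ordinary request line such as "GET /index.html HTTP/1.1"
#               never matches and the redirect never fires
# [A-Z]{3,9}  = between three and nine capital letters (GET, POST, HEAD, OPTIONS)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]
```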
2:56 am on Aug 18, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


I am working on learning more RegEx syntax, but the code is still not working and I can't figure it out.

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]


This is what I am currently using:

RewriteEngine on

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /([^/]+/)*index(\.[a-z0-9]+)?[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index(\.[a-z0-9]+)?$ http://www.example.com/$1? [R=301,L]

# Redirect non-canonical hostname to www
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


Will the code that I am currently using slow down page loads? It really seems to be slowing down page loads on pages that should be in the cache. Also, how long do I need to leave this redirect up after the change from index.html to / ?
6:47 am on Aug 18, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


the code is still not working

precisely how is it "not working"?

how long do I need to leave this redirect up after the change from index.html to / ?

as long as your server is receiving requests for index.html you will need a redirect to the canonical url.
8:50 am on Aug 18, 2012 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12704
votes: 244


This element

index(\.[a-z0-9]+)?

strikes me as overkill. You said at the beginning that your index files may have php OR html OR htm. That's three. If someone asks for a nonexistent extension, go ahead and dump a 404 on 'em. And if someone asks for "index" without extension, they are probably up to no good. So cut it back to

index\.(php|html?)$

And then, once you've canonicalized all your directory names, you can rename the physical index files so Apache doesn't have to waste time looking for three possible files every time there's a request for a directory. You might possibly need both php and htm(l) but you can definitely regularize to either html or htm throughout. The Apache default is html, so that will save the usual nano-micro-thingie.
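
As a sketch of that last step, assuming every index.htm has been renamed to index.html and that some directories still need a PHP index, the lookup list could be narrowed with mod_dir's DirectoryIndex directive. The exact file list here is an assumption, so adjust it to what actually exists on the site.

```apache
# Files Apache tries, in order, when a directory URL such as /subdir/ is requested.
# Listing only the names that really exist avoids pointless lookups.
DirectoryIndex index.html index.php
```

In .htaccess this directive only takes effect if the host allows it (AllowOverride Indexes); otherwise it belongs in the main server configuration.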

as long as your server is receiving requests for index.html you will need a redirect to the canonical url.

If any of those requests are coming from the googlebot, you can condense that sentence to "forever" ;)
8:24 pm on Aug 18, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


Actually the code that g1smd recommended does work. I made a slight typo.
8:43 pm on Aug 18, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


This is what I have now and it seems to be doing the trick:

RewriteEngine on

# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(php|html?)$ http://www.example.com/$1? [R=301,L]

# Redirect non-canonical hostname to www
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]


Thanks for all the help and please let me know if there are any other recommendations on the rewrite codes above.
3:16 am on Aug 21, 2012 (gmt 0)

Junior Member

5+ Year Member

joined:July 17, 2006
posts:146
votes: 0


Now that I have changed all of my links, uploaded and redirected in htaccess, roughly how long does it take for these changes to show up in Google search results?
5:02 am on Aug 21, 2012 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


the last time i measured something like this google was dropping about 50% of the non-canonical urls from the index each month.
7:40 pm on Aug 21, 2012 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


You'll see one or two effects within weeks, but I usually say three to six months, and it can be longer.
6:40 am on Jan 2, 2013 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 28, 2003
posts: 155
votes: 1


If I use the above code, but have
link rel="canonical" on http://www.example.com/categoryname/index.html within each directory, am I shooting myself in the foot?
8:30 am on Jan 2, 2013 (gmt 0)

Administrator

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Aug 10, 2004
posts:10542
votes: 8


what is the link element's href attribute value?
which url are you linking to internally?
it is technically better if you redirect requests for non-canonical urls using a 301 status code.
11:22 am on Jan 2, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Hopefully, you have

link rel="canonical" href="http://www.example.com/categoryname/"


but you should add the redirect.

Make sure that the site internally links to URLs without the index file name.
7:12 pm on Jan 2, 2013 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 28, 2003
posts: 155
votes: 1


Hopefully, you have

link rel="canonical" href="http://www.example.com/categoryname/"


Well, I confess what I currently have is:
link rel="canonical" href="http://www.example.com/categoryname/index.shtml"

with this in htaccess:
RewriteEngine on
RewriteCond %{HTTP_HOST} ^example\.com
RewriteRule ^(.*)$ http://www.example.com/$1 [R=permanent,L]

When linking to the home page of each directory within my site, I always use the full url with the index.shtml extension - with one exception, which is a breadcrumb script that automatically links to http://www.example.com/categoryname/

I think this has led to some dilution of page importance in Google, but I'm not sure. I just know my 11-year-old site is suffering badly after years of doing very well. Is dropping the index.shtml extension recommended for SEO?

If so, then I assume that I need to 1) rewrite all internal links to the new format and 2) change the htaccess to the code cited above. Are there any other steps I should take?
7:45 pm on Jan 2, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


Change all internal links to point to URLs without the index filename.

Change the rel="canonical" to point to URLs without the index filename.

Redirect requests for index URLs to the URL without the index filename.

Make sure the index redirect is listed before the non-www redirect.

Update your non-www redirect to use the more inclusive code mentioned a few posts back.
9:08 pm on Jan 2, 2013 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month

joined:Apr 9, 2011
posts:12704
votes: 244


Look at it this way: One side of your mouth is calling something canonical, and the other side of your mouth is deploying a 301 to say it's not canonical. Who you gonna believe?

Google will eventually get annoyed if it follows your own internal links and runs smash into a redirect. How annoyed it gets, and how that annoyance manifests itself, are questions for another thread. Or two, or a hundred ;)
4:25 pm on Jan 3, 2013 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 28, 2003
posts: 155
votes: 1


Thank you for the helpful responses. I've been busy. My site is static, and there's no easy way to make global changes, but here's what I've done:

- Changed as many of the internal directory links as I could find (navigation, past newsletters, etc)
- Changed the rel="canonical" on individual directory indexes to point to urls without the index.shtml extension
- Updated htaccess to this:

RewriteEngine on
# Redirect index in any directory to root of that directory
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index\.(shtml|html?)$ http://www.example.com/$1? [R=301,L]
# Redirect non-canonical hostname to www
RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]

Should I also put in individual 301 redirects for each category index?

I also regenerated my sitemap and pinged all the major search engines.

The bad news is that my traffic from Google has plummeted overnight, down by about 40%, which is going to cost me a fortune in lost revenue. Any insight as to whether this might improve and, if so, how long it might take?
5:04 pm on Jan 3, 2013 (gmt 0)

Senior Member

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:July 3, 2002
posts:18903
votes: 0


I would have added the redirect some time after fixing the canonical tag and fixing the navigation.

Hopefully, you are now pointing to URLs ending with a slash.

Use the Live HTTP Headers extension for Firefox to check the responses for a few canonical and non-canonical URLs.

Run Xenu LinkSleuth over the site and carefully check the reports.

(shtml|html?)
simplifies to
s?html?


Add a blank line after each RewriteRule for clarity.

Should I also put in individual 301 redirects for each category index?

That's what the
([^/]+/)*
and
(([^/]+/)*)
bits are for.
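
A quick annotated sketch of how those bits behave, assuming the .htaccess is in the site root (so the Rule sees the path without a leading slash) and using made-up directory names:

```apache
# ([^/]+/)*    = zero or more "name/" segments
# (([^/]+/)*)  = the outer parentheses capture all of those segments as $1
#
#   index.shtml          -> $1 is empty     -> http://www.example.com/
#   cat/index.html       -> $1 = cat/       -> http://www.example.com/cat/
#   cat/sub/index.htm    -> $1 = cat/sub/   -> http://www.example.com/cat/sub/
#
# Note that s?html? also accepts "shtm", which is harmless for a redirect.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /([^/]+/)*index\.[^\ ]*\ HTTP/
RewriteRule ^(([^/]+/)*)index\.s?html?$ http://www.example.com/$1? [R=301,L]
```

One rule therefore covers the root and every category directory, at any depth.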