Forum Moderators: Ocean10000 & phranque

Redirecting index.html to root directory

     
2:31 pm on Nov 12, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Mar 28, 2004
posts:224
votes: 0


I have a site that has domain.com and domain.com/index.html indexed in SEs. I recently changed the index.html to index.php. I only want people to come to the home page through domain.com, not index.php (or index.html).

I tried to redirect index.html to domain.com, but I get some kind of loop. How do I do this? Should I use mod rewrite somehow? Or redirects?

2:44 pm on Nov 12, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


See this thread [webmasterworld.com] to get started.

Jim

3:12 pm on Nov 12, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Mar 28, 2004
posts:224
votes: 0


Thanks for the fast response, Jim! Precisely what I was looking for.

4

4:35 pm on Nov 14, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 17, 2004
posts:117
votes: 0


4string, how'd it go? I have been looking into this too, as I see requests for both index.html and / in my logs.

There are a couple of different methods I've found including the one jdmorgan points to above for the index.html to /. I have a question and if someone could elaborate I'd appreciate it.

From the link above-
--------------
1)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ /$1 [R=301,L]

I'm using this as it seems to work for me so far
--------------
2)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ $1 [R=301,L]

I tried this first, but it caused index.html to be not found instead of redirecting
--------------

From another thread on WebmasterWorld
--------------
3)
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html?
RewriteRule ^index\.html?$ [yoursite.com...] [R=301,L]

I haven't tried this one, as the first one above appears to be working
--------------

Question:
Is #1 the best option and are there any issues/potential issues I should be aware of?

Thanks.

5:24 pm on Nov 14, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


For best results across many server configurations, you should specify a protocol and a canonical address in the rewriterule:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
RewriteRule ^(.*)index\.html$ http://www.example.com/$1 [R=301,L]

Also, this code is intended to redirect "index.html" in any directory or subdirectory to "/" in that same (sub)directory. If you do not have Web-accessible index pages in subdirectories, then the following, which redirects only your site home page, is more efficient:

RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://www.example.com/ [R=301,L]
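
For anyone who, like the original poster, has switched the home page to index.php, the home-page-only rule above can be extended to cover both filenames. This is only a minimal sketch under stated assumptions: mod_rewrite is enabled, the rules live in the .htaccess file in the document root, and www.example.com stands in for your canonical hostname. The RewriteEngine line and the (html|php) alternation are additions that do not appear in the code above.

```apache
# Assumption: this is the .htaccess file in the site's document root.
RewriteEngine On

# THE_REQUEST holds the request line exactly as the client sent it,
# for example "GET /index.html HTTP/1.1". The condition therefore matches
# only when the client itself asked for /index.html or /index.php.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.(html|php)\ HTTP/

# In .htaccess context the pattern is matched without the leading slash.
# Send a permanent redirect to the bare home page URL.
RewriteRule ^index\.(html|php)$ http://www.example.com/ [R=301,L]
```

Because the condition tests the original request line, a request for "/" that the server maps internally to the index file does not match it. That is what avoids the kind of redirect loop described at the top of this thread. Note that a request with a query string, such as /index.php?id=1, does not match this pattern either and would be left alone.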

Jim

6:26 pm on Nov 14, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Mar 28, 2004
posts:224
votes: 0


I had some trouble, but Jim's example here is working fine.

Thank you Jim!

8:53 pm on Nov 14, 2005 (gmt 0)

Junior Member

10+ Year Member

joined:Mar 17, 2004
posts:117
votes: 0


Thanks jdMorgan. I switched just in case and it works great!

Glad to hear you got it working 4string.

4:40 am on Nov 20, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Mar 28, 2004
posts:224
votes: 0


Well, this works for humans, but bots have been getting 301s when hitting www.example.com/ now. Any ideas why?

It was a bit disconcerting, so I removed it for now. I probably won't have time to mess with it until after the US holidays. I really wanted this to work. Maybe I should just rewrite index.php to index.html and live with there being an index.html out there. Do Google et al. treat www.example.com/ and www.example.com/index.html as two separate pages?

6:21 am on Nov 20, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Nov 1, 2002
posts:1834
votes: 0


This - [R=301,L] - creates the 301 redirect. This is exactly what you want the bots to see. You want them to know that the index.html has been permanently moved to the root, which will prevent you from having what appears to be duplicate content.

WBF

2:48 pm on Nov 20, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Mar 28, 2004
posts:224
votes: 0


Thanks for your response.

I haven't gotten any bots except Yahoo past the root for the domain since I implemented this. So, if I see a 301 code for /, does that mean the bot had requested index.html? If I request index.html, I see a 301 for index.html and a 200 for /. It seems like that's what the bot should get, too.

Sometimes I'm seeing 301s for robots.txt, too. I don't think the site/host is unreliable. Also, MSNbot only hits robots and goes away though it is not excluded (or mentioned) in my robots.txt.

5:17 pm on Nov 20, 2005 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
posts:25430
votes: 0


The code doesn't care whether it's a 'bot or "human" visitor. The behaviour you are seeing might well be due to the 'bot requesting example.com/index.html, and being redirected to www.example.com/. Unless you've already taken care of this domain canonicalization problem and are sure this isn't likely to happen, that is a possible cause of 'not getting past the home page' as you describe.

It is entirely up to the client 'bot or browser as to how it wants to respond to a server 301 response. Note the terms -- server and client -- because it is the client who is 'in charge' and the server does the client's bidding. So, it is up to the 'bot to decide if it wants to follow the 301 now, or put it in a queue for later processing. Without seeing raw logs or knowing whether you've got canonicalization issues, it's hard to tell what's happening with your site. All I can really say is that if a client requests index.html from any domain that resolves to this .htaccess file's directory, it will get a 301 to "www.example.com/" and any issues outside of that need to be handled separately.

To rule out canonical domain redirects as a cause of spider 'pauses', you could try modifying the code (from my second example, here) so that it won't ever change the requested domain:


RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html\ HTTP/
RewriteRule ^index\.html$ http://%{HTTP_HOST}/ [R=301,L]

This will issue the home page redirect using whatever domain is in the client's hostname header. If it changes the 'bot's behaviour, then it's likely the 'bot was asking for example.com/index.html and being redirected to www.example.com/, which indicates that you have a canonicalization problem that needs to be addressed. This can be confirmed by looking at the SERPs; if your 'site' has pages listed under both www and non-www domains, then you've got a problem.
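
If the SERPs do show both www and non-www listings, a separate host redirect is one common way to address it. The following is a hedged sketch rather than a drop-in fix: it assumes www.example.com is the hostname you want indexed, that the site is served over plain http, and that the rules sit in the document-root .htaccess file. None of these lines appear earlier in the thread.

```apache
RewriteEngine On

# Skip requests that carry no Host header at all (old HTTP/1.0 clients),
# since redirecting them could loop.
RewriteCond %{HTTP_HOST} .

# Act only when the requested host is not already the canonical one.
# Assumption: clients do not append a port (such as ":80") to the Host header.
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]

# Redirect to the same path on the canonical host.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

With a rule like this in place, a request for example.com/index.html is moved to the www host in a single step when the index.html rule targets www.example.com directly, as in the earlier examples. Which of the two rules comes first in the file affects how many redirects a client sees, so it is worth checking the result with a header checker after any change.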

Jim

5:35 pm on Nov 20, 2005 (gmt 0)

Full Member

10+ Year Member

joined:Mar 28, 2004
posts:224
votes: 0


Ok. That clarifies it for me. I'll put it back up the way it was and watch a little more. I do have the www issue solved. Thanks for clearing all that up. I guess I'm just paranoid.

I still can't figure out what's wrong with MSNbot. Wrong forum for that though.

Thanks Jim and Willy!

 
