Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
"index.htm" and "index.html"
Can they coexist?
Bobby

10+ Year Member



 
Msg#: 27789 posted 11:16 pm on Feb 1, 2005 (gmt 0)

Just curious if anyone's had any experience here with this one.

I changed the index page and put it up as index.html where it used to be index.htm.

Previously there had never been an index.html

The server defaults to index.html, but I'm wondering if I can keep the old index.htm up there anyway.

Would it be treated as simply another page?

 

ciml

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 6:18 pm on Feb 2, 2005 (gmt 0)

I'm afraid not. Links to /index.htm and /index.html are treated as links to /

index.shtml, index.pl, etc. are treated literally.

You could have index.htm and InDeX.HtM but people would think you're just trying to look kewl. :-)

Bobby

10+ Year Member



 
Msg#: 27789 posted 9:18 pm on Feb 2, 2005 (gmt 0)

So then the old index.htm passed its PR to the new one right?

I've got a link from the new one back to the old one hoping that the spider will follow the links that were up on the old one, do you think it will?

Macro

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 27789 posted 9:21 pm on Feb 2, 2005 (gmt 0)

My default page is default.htm. mysite.com/ shows up as a PR6. I also have an index.htm and an index.html and they both show up as 6. Strangely, if I go to mysite.com/default.htm it shows up as a 0.

BigDave

WebmasterWorld Senior Member bigdave is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 10:02 pm on Feb 2, 2005 (gmt 0)

You should never have a link to an index.html or index.htm file. You should always link to the directory to avoid any problems.

example.com/widgets/ GOOD
example.com/widgets/index.htm BAD

Bobby

10+ Year Member



 
Msg#: 27789 posted 7:09 am on Feb 3, 2005 (gmt 0)

> I also have an index.htm and an index.html and they both show up as 6

Macro, do the 2 pages have different content and are then both indexed differently by Google?

ciml

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 9:27 am on Feb 3, 2005 (gmt 0)

> So then the old index.htm passed its PR to the new one right?

You could see it like that.

The easiest way to think about this is that Googlebot is seeing these:
* <A href="http://www.example.com/index.htm">
* <A href="http://www.example.com/index.html">
* <A href="http://www.example.com">

and treating them all as <A href="http://www.example.com"> before following the links or assigning PageRank.

> should never have a link to an index.html or index.htm file

I agree with that but as far as Google is concerned now, they just count as links to / so it doesn't matter. It certainly can matter in non-Google contexts.
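
The folding described above can be sketched in a few lines. This is only an illustration of the behaviour reported in this thread, not a documented Google rule; the function name `fold_index_url` and the set `MERGED_INDEX_NAMES` are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Filenames this thread reports as being folded into the directory URL.
# The set is an assumption for illustration, not a published specification.
MERGED_INDEX_NAMES = {"index.htm", "index.html"}


def fold_index_url(url: str) -> str:
    """Return url with a trailing index.htm or index.html dropped from the path."""
    parts = urlsplit(url)
    directory, _, filename = parts.path.rpartition("/")
    if filename in MERGED_INDEX_NAMES:
        # /widgets/index.htm becomes /widgets/
        path = directory + "/"
    else:
        # Anything else is taken literally; an empty path is written as "/".
        path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


for link in (
    "http://www.example.com/index.htm",
    "http://www.example.com/index.html",
    "http://www.example.com",
    "http://www.example.com/index.shtml",
):
    print(link, "->", fold_index_url(link))
```

The comparison is case-sensitive on purpose, so a name like InDeX.HtM is left alone, in line with the earlier remark that other filenames are treated literally.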

Macro

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 27789 posted 10:04 am on Feb 3, 2005 (gmt 0)

Bobby, the site is #1 for the keyword on the default.htm page. However, searching for text exclusive to index.html doesn't show that page in the SERPs.

Ouch, one more thing for the link exchangers and link buyers to watch out for!

ciml

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 1:36 pm on Feb 3, 2005 (gmt 0)

Macro, /default.htm is not merged with /. As far as I know, only /index.htm and /index.html are.

So if /default.htm and / both have links to them it is normal for them to be listed separately, with different PageRank and backlinks. If Google then identifies them as being 100% duplicates they should be merged into one listing, which inherits the PR and backlinks of both. Note that Googlebot may visit the two different URLs at different times, so if there are frequent changes (such as a 'what's new' column or an automatically generated date) then they will not merge if the robot finds slightly different content.
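
To see why a date stamp can block the merge, here is a rough sketch. How Google actually compares pages is not stated in this thread; the exact-hash comparison, the function names and the two page bodies below are all assumptions made up for the example.

```python
import hashlib


def fingerprint(body: str) -> str:
    """Hash the page text after collapsing runs of whitespace."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def would_merge(body_a: str, body_b: str) -> bool:
    """True only when the two fetched bodies are exact duplicates."""
    return fingerprint(body_a) == fingerprint(body_b)


# Hypothetical bodies fetched from / and /default.htm on different days.
root_body = "<h1>Widgets</h1><p>Generated 2 Feb 2005</p>"
default_body = "<h1>Widgets</h1><p>Generated 3 Feb 2005</p>"

print(would_merge(root_body, root_body))     # same fetch compared with itself
print(would_merge(root_body, default_body))  # only the date stamp differs
```

With an all-or-nothing comparison like this, a single changed character is enough to keep the two URLs listed separately.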

BigDave

WebmasterWorld Senior Member bigdave is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 6:25 pm on Feb 3, 2005 (gmt 0)

I have noticed that Google will also merge index.php, but since php is dynamic by definition, they seem to take MUCH longer, and it has to be a very stable page with several crawls between updates.

robster124

10+ Year Member



 
Msg#: 27789 posted 8:41 pm on Feb 3, 2005 (gmt 0)

I have a site PR 5. Nowhere in the site (or external to the site) are there links to /default.asp - but it has a PR 5.

I think you'll find Google can recognise an index page (e.g. index.php, default.asp, default.htm, etc.) and will just assign the same PR to all these pages that display homepage content. Perhaps I'm wrong...

Macro

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 27789 posted 8:55 pm on Feb 3, 2005 (gmt 0)

My index.htm and index.html had different PRs last year. (site was defaulting to default.htm)

robster124

10+ Year Member



 
Msg#: 27789 posted 9:31 pm on Feb 3, 2005 (gmt 0)

Google is a strange beast

ciml

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 11:12 am on Feb 4, 2005 (gmt 0)

Google will merge index.php and default.asp and foo.bar if it finds the same content. This is a different process from not crawling index.html and index.htm

> index.htm and index.html had different PRs last year

That was before last September though? I don't know when Google stopped crawling those URLs, but it was before 1 September 2004.

BigDave

WebmasterWorld Senior Member bigdave is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 5:03 pm on Feb 4, 2005 (gmt 0)

Really? They don't even crawl */index.html or */index.htm anymore and just crawl */?!?!?

That's just stupid!

I can think of at least half a dozen ways to take advantage of this and be able to claim innocence. Hell, I could use both index.htm and index.html to double the effect.

Bobby

10+ Year Member



 
Msg#: 27789 posted 7:18 am on Feb 5, 2005 (gmt 0)

I'm keeping an eye on how Google reacts to the two files (index.html and index.htm). Apparently it only recognizes the default one.

My real concern is whether or not the old pages which were linked to from index.htm will disappear from the SERPs as there are no links to those pages from the new index page (new structure and new pages).

At this point they are still present in the SERPs, but that may be because, as far as I know, Google will not eliminate the files any time soon. That brings us to the question of how long it takes for an indexed file to disappear from the SERPs, assuming there are no more links to the page.

> You should never have a link to an index.html or index.htm file. You should always link to the directory to avoid any problems.

I don't know about you BigDave but when I develop a site I link to the index page from all other pages. On a local level it doesn't work to link to the folder as the page will not appear, only the folder. If I link to another site on the web I will link to the folder.

Macro

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 27789 posted 10:29 am on Feb 5, 2005 (gmt 0)

BigDave, as I said:
> Ouch, one more thing for the link exchangers and link buyers to watch out for!

I'm sure there are other pitfalls to this behaviour.

> That was before last September though?

It was. I can't remember when I last saw different PRs but it was at least a year ago. Because they are showing up with the same TPR now I have no way of knowing.... wait! I do. There are no IBL to index.htm and index.html and they still went from PR5 to PR6. Hmmm.

ciml

WebmasterWorld Senior Member ciml is a WebmasterWorld Top Contributor of All Time 10+ Year Member



 
Msg#: 27789 posted 10:35 am on Feb 5, 2005 (gmt 0)

> That's just stupid!

I think I said something similar a while back, but not quite so succinctly. Remember the "using the removal tool to take out someone else's homepage" thing that was partly fixed some time ago? 1 + 1 = ...

> how Google reacts to the two files

Bobby, Google doesn't react to files on your webserver, but to URLs (and sometimes inappropriate assumptions of the underlying files). I know this seems obvious, but the difference between internal files and external URLs is crucial to this topic.

> whether or not the old pages which were linked to from index.htm will disappear from the SERPs

As long as / returns the same content as /index.htm, it doesn't matter. Where it is different, you can have a problem.

newcomer

10+ Year Member



 
Msg#: 27789 posted 11:53 pm on Feb 5, 2005 (gmt 0)

Hello Everyone,

I have a question about a problem that, as yet, I have no idea how to fix.

This topic was about duplicate content with htm and html. My question is this: for the last week or so, Google has had 2 copies of each page on my site in its index. First with wwwdotmysitedotcom and again as just mysitedotcom

It does not appear that my PR was split, but I do know that my wwwdotmysitedotcom comes up on certain keywords and the other just mysitedotcom comes up on other keywords.

I have always had 1 site. My site is over 3 years old. I have always submitted my site as wwwdotmysitedotcom

I e-mailed Google a few days ago explaining what I have found, with no response yet.

Does anyone know how I can fix this problem?

Look forward to all advice..

Thanks

Newcomer

sailorjwd

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 27789 posted 1:22 pm on Feb 6, 2005 (gmt 0)

Newcomer,

I have close to the same problem, and my site has nearly disappeared from the SERPs as of Feb 2. I'm looking for assistance too.

I have

mysite.com/index.htm
www.mysite.com/index.htm (supplemental)

I used the removal tool to try to get rid of all occurrences of index.htm.
Now I am left with only the supplemental listing and can't seem to get rid of it. Emails to google have been ignored.

Anyone have any suggestions?
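
A note on the www / non-www duplication described in the last two posts: a common general remedy, not something confirmed anywhere in this thread, is to pick one hostname and answer requests for the other with a permanent (301) redirect. The sketch below only shows the URL mapping such a redirect would use; the hostnames are the placeholders from the posts above, and `CANONICAL_HOST`, `ALIAS_HOSTS` and `redirect_target` are made-up names.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Placeholder hostnames taken from the posts above; both are assumptions.
CANONICAL_HOST = "www.mysite.com"
ALIAS_HOSTS = {"mysite.com"}


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a 301 redirect should point to, or None if no redirect is needed."""
    parts = urlsplit(url)
    # parts.hostname is already lower-cased and excludes any port number.
    if parts.hostname in ALIAS_HOSTS:
        return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, ""))
    return None


print(redirect_target("http://mysite.com/index.htm"))
print(redirect_target("http://www.mysite.com/index.htm"))
```

The mapping keeps the path and query string, so every page on the alias hostname lands on its counterpart rather than on the homepage. Any port in the original URL is dropped, which is fine for a plain port-80 site but would need handling otherwise.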


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved