That is, usually www.example.com/ (note the slash) is the same page as www.example.com/index.html. If you have it both ways, Google could well think you have twice as many pages as you really do. After someone pointed this out to me a while back, I've been steadily removing explicit mentions of index.html and just giving directory links. I hope all five thousand of your pages aren't hand coded!
About the trailing slash: a link to a directory properly ends in a forward slash. Your web server will still serve the page if it's missing, but it will do so by redirecting the reader's browser to the URL with the slash. That extra round trip puts a slightly heavier load on your server and slows response times. So always put the trailing slash in your directory links.
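If your pages are generated rather than hand coded, the two rules above (drop index.html, end directory links with a slash) could be applied in one place. Here is a rough Python sketch; the function name `canonical_link` and the list of index file names are my own assumptions, it only handles absolute URLs, and it guesses that a last path segment with no dot in it is a directory, which may not hold on every site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: these are the default documents your server uses.
INDEX_NAMES = {"index.html", "index.htm"}

def canonical_link(url):
    """Return an absolute URL with index.html removed and a trailing
    slash added to what looks like a directory link."""
    parts = urlsplit(url)
    path = parts.path or "/"          # bare domain -> root directory
    head, _, last = path.rpartition("/")
    if last in INDEX_NAMES:
        # /widgets/index.html -> /widgets/
        path = head + "/"
    elif last and "." not in last:
        # Heuristic (assumption): no file extension means a directory.
        path = path + "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, path, parts.query, parts.fragment)
    )

print(canonical_link("http://www.example.com/index.html"))
print(canonical_link("http://www.example.com/widgets"))
```

Links to ordinary files such as /widgets/page.html pass through unchanged, so it should be safe to run every internal link through it.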
After someone pointed this out to me a while back, I've been steadily removing explicit mentions of index.html and just giving directory links.
Will that accomplish anything if even one outsider links to index.html on your site (which may well be the case if you receive organic links and a linking party simply copies the page URL into his HTML editor)?
I get organic links all the time (mostly from obscure sources), and there's no way to even keep track of all such links, let alone police the URLs that third-party Webmasters may be using.
If you keep clicking through the pages shown it does stop at roughly the right number.
I was worried about duplicate content etc. when I first saw this some weeks ago.
My guess now is that it is, hopefully, just a page count problem similar to the one with MSN.
Will that accomplish anything if even one outsider links to index.html on your site (which may well be the case if you receive organic links and a linking party simply copies the page URL into his HTML editor)?
You can't stop someone from saying "index.html" in his own link, but you can avoid intentionally showing him index.html in your own links.
A while after I had removed most of them from my site, I stopped getting much traffic to the index.html URLs, and just got directory hits. I think what will happen is that new links that others place will use your proper directory URL, while few new links to index.html will appear, so index.html will fade in search engine rank as the directories rise.
I think Google at the start counted my pages plus the hijacker copies; then 3-4 weeks later I was out of the SERPs. I hope that's the case with you; it's just one option.
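For anyone wanting to track down the index.html links still left in their own pages, something along these lines might help. It is only a sketch using Python's standard `html.parser`; the class and function names are made up for illustration, and it looks only at `href` attributes on `a` tags.

```python
from html.parser import HTMLParser

class IndexLinkFinder(HTMLParser):
    """Collects href values whose path ends in index.html."""

    def __init__(self):
        super().__init__()
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Ignore any #fragment or ?query when checking the file name.
                path = value.split("#", 1)[0].split("?", 1)[0]
                if path.rsplit("/", 1)[-1] == "index.html":
                    self.hits.append(value)

def find_index_links(html_text):
    """Return every link in html_text that explicitly names index.html."""
    finder = IndexLinkFinder()
    finder.feed(html_text)
    return finder.hits

page = '<a href="/widgets/index.html">Widgets</a> <a href="/widgets/">OK</a>'
print(find_index_links(page))
```

Run over each page of the site, this gives a list of links to change to plain directory links.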
You can't stop someone from saying "index.html" in his own link, but you can avoid intentionally showing him index.html in your own links.
What about the trailing slash in domain names? If you're using an absolute link for your home page in a navigation bar, does Google treat [mysite.com...] differently than [mysite.com?...]