LouMinatti - 6:30 pm on Mar 28, 2005 (gmt 0)
geekay, I believe that Google will issue a duplicate content penalty for this. Here is what happened to us:
We have a couple hundred pages. Our homepage file is called default.htm. On a handful of internal pages, links back to the homepage mistakenly pointed to default.html or, worse yet, index.htm. When I discovered we had been hit, I spun my wheels for weeks before figuring this out. I ran a search, and sure enough, Google had crawled the pages with the incorrect links and indexed the phantom, non-existent pages as "real", all with content identical to our homepage. I cannot prove this was the problem, but once I marked those phantom pages as "do not index", we were back in Google.
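For anyone who wants to check their own pages for this kind of stray homepage link, here is a rough Python sketch. It is not what we actually ran. The function name and the list of suspect filenames are my own assumptions for illustration, and the check is crude: it flags any link whose filename matches, even in a subdirectory where index.htm might be legitimate.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit
import posixpath

# Assumption: the real homepage file is default.htm, as in the post above.
CANONICAL_HOME = "default.htm"
# Hypothetical list of filenames that should never be used for the homepage.
SUSPECT_NAMES = {"default.html", "index.htm", "index.html"}


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_suspect_home_links(html_text):
    """Return hrefs whose filename looks like a mistyped homepage name."""
    collector = _LinkCollector()
    collector.feed(html_text)
    suspects = []
    for href in collector.hrefs:
        # Drop any query string or #fragment, then compare the bare filename.
        path = urlsplit(href).path
        filename = posixpath.basename(path).lower()
        if filename in SUSPECT_NAMES:
            suspects.append(href)
    return suspects


if __name__ == "__main__":
    page = (
        '<a href="default.htm">Home</a> '
        '<a href="/default.html">Home</a> '
        '<a href="../index.htm#top">Home</a>'
    )
    for bad in find_suspect_home_links(page):
        print("suspect link:", bad, "(expected", CANONICAL_HOME + ")")
```

Run something like this over each page's source and fix whatever it reports so that every link points at the one real homepage file.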