My site has been slowly losing traffic, so I have been looking around, and I found out that I have around 8000 urls in the supplemental index that all point back to the same handful of urls on my site. I am afraid it looks like I have more duplicate than real content; my real urls number about 1000.
The extra urls never existed. I will try to explain. Here are some real urls on my site:
example.com/sitemap.shtml
example.com/tos.shtml
example.com/literature/
Then Googlebot goes along and combines the real urls in crazy ways to make up the phantom urls, like this:
example.com/sitemap.shtml/tos.shtml
example.com/sitemap.shtml/literature/
example.com/sitemap.shtml/literature/literature/tos.shtml
And it gets just nuts, some of the phantom urls actually look something like this:
example.com/sitemap.shtml/literature/literature/literature/literature/literature/literature/literature/tos.shtml
The phantom example above (if it were real) would resolve and would look just like:
example.com/sitemap.shtml, but with broken graphics, so it is a duplicate.
When I look at my stats I can list "all urls". My real urls are listed first because they get more traffic; when I get down to the phantom urls, I can see that they make up about 70-80% of my total urls.
So far I have run some link checks and searched for other sites linking to me incorrectly, but I didn't find anything. I also checked Yahoo, which doesn't have the phantoms, and I wrote to Google on Sunday but haven't heard back yet.
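For anyone who wants to put a number on this from an exported url list, here is a rough Python sketch. It assumes the urls are written the way they appear in this post (host name first, scheme optional) and that a real page ends at the first segment ending in .shtml; the function names is_phantom and phantom_share are made up for illustration.

```python
import re

# Assumption: a real page ends at the first path segment that looks like a
# file (here, anything ending in .shtml). Extra segments after it are treated
# as phantom path info.
FILE_SEGMENT = re.compile(r"\.shtml$", re.IGNORECASE)


def is_phantom(url):
    """True if the url has extra path segments after a .shtml segment.

    Expects the host name to be present, e.g. 'example.com/tos.shtml'.
    """
    path = url.split("://", 1)[-1].split("?", 1)[0]
    segments = [s for s in path.split("/") if s][1:]  # drop the host name
    # Every segment except the last is checked: a .shtml segment that is
    # followed by anything else marks the url as a phantom.
    return any(FILE_SEGMENT.search(seg) for seg in segments[:-1])


def phantom_share(urls):
    """Fraction of the given urls that look like phantoms (0.0 if empty)."""
    urls = list(urls)
    if not urls:
        return 0.0
    return sum(is_phantom(u) for u in urls) / len(urls)
```

Feeding phantom_share the full "all urls" list from the stats package should give a figure comparable to the rough percentage estimated by eye.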
Is it Googlebot, something I have done, or my server...?
What should I do?
Thanks for reading :)
There is a Firefox extension called Live HTTP Headers that will allow you to verify what your server is actually telling Googlebot about these urls.
example.com/sitemap.shtml/literature/literature/literature/literature/literature/literature/literature/tos.shtml
Jim
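If a browser extension is not handy, the same check can be scripted. The following is only a sketch using Python's standard http.client: it sends one HEAD request and reports the raw status code and any Location header, without following redirects. The helper names split_target and raw_status and the User-Agent string are invented for this example, and some servers answer HEAD differently from GET, so treat the result as a first look rather than proof of what Googlebot sees.

```python
import http.client
from urllib.parse import urlsplit


def split_target(url):
    """Return (scheme, host, path); assumes http:// when no scheme is given."""
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.scheme, parts.netloc, path


def raw_status(url, timeout=10):
    """Send a single HEAD request and return (status, Location header).

    Redirects are NOT followed, so a 301/302 shows up as itself.
    """
    scheme, host, path = split_target(url)
    if scheme == "https":
        conn = http.client.HTTPSConnection(host, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        # Hypothetical User-Agent; it does not imitate Googlebot.
        conn.request("HEAD", path, headers={"User-Agent": "header-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

Calling raw_status on a real url and on one of the phantom urls, then comparing the two results, shows whether the server treats them differently.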
The phantom urls all resolve. I checked the headers on some of them and they return a 200.
example.com/sitemap.shtml
example.com/sitemap.shtml/tos.shtml
example.com/sitemap.shtml/literature/tos.shtml
All of the above go to the same page and all return a 200.
A URL redirect gone bad? How do I find out what that is?
Thank you :)
Oh, I just tried this:
example.com/sitemap.shtml/asldkhfasjgasjaskjg
Just a real url followed by a / and random letters, and that works too and also returns a 200...
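The random-letters test can be repeated across several pages with a small script. This is a sketch only: junk_probe and accepts_trailing_junk are hypothetical names, and get_status stands in for whatever header checker is at hand (any callable that takes a url and returns a status code).

```python
import random
import string


def junk_probe(real_url, length=12, rng=random):
    """Append '/' plus random lowercase letters to a real url."""
    suffix = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return real_url.rstrip("/") + "/" + suffix


def accepts_trailing_junk(real_url, get_status, tries=3):
    """True if every random probe of real_url comes back with a 200.

    get_status is a placeholder: any callable mapping a url to an HTTP
    status code. Nothing here performs a network request by itself.
    """
    return all(get_status(junk_probe(real_url)) == 200 for _ in range(tries))
```

If this comes back True for the real pages, the server is answering 200 for arbitrary trailing paths, not only for the particular combinations that ended up in the index.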