Forum Moderators: Robert Charlton & goodroi


phantom url syndrome

lots of phantom urls in the supplemental index

         

proboscis

10:08 pm on Jul 11, 2006 (gmt 0)

10+ Year Member



Hi,

My site has been slowly losing traffic, so I have been looking all around, and I found out that I have around 8000 urls in the supplemental index that all point back to the same handful of urls on my site. I have only about 1000 real urls, so I am afraid it looks like I have more duplicate content than real content.

The extra urls never existed. I will try to explain. Here are some real urls on my site:

example.com/sitemap.shtml
example.com/tos.shtml
example.com/literature/

Then Googlebot goes along and combines the real urls in crazy ways to make up the phantom urls, like this:

example.com/sitemap.shtml/tos.shtml
example.com/sitemap.shtml/literature/
example.com/sitemap.shtml/literature/literature/tos.shtml

And it gets just nuts, some of the phantom urls actually look something like this:

example.com/sitemap.shtml/literature/literature/literature/literature/literature/literature/literature/tos.shtml

The phantom example above (if it were real) would resolve and would look just like:

example.com/sitemap.shtml -but with broken graphics, so a duplicate.

When I look at my stats I can list "all urls". My real urls are listed first because they get more traffic, but when I get down to the phantom urls I can see that they make up about 70 - 80% of my total urls.

So far I have run some link checks and searched for other sites linking to me incorrectly, and I didn't find anything. I also checked Yahoo, and they don't have the phantoms. I wrote to Google on Sunday but haven't heard back yet.
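For anyone who wants to measure this from an exported url list, here is a minimal sketch. It assumes a phantom is any path that continues past a real file name with a slash; the names `phantom_base`, `phantom_share`, `real_files` and `listed` are made up for illustration and are not from any stats package.

```python
def phantom_base(path, real_files):
    """Return the real file url that a phantom path is stacked on, else None."""
    for real in real_files:
        # A phantom keeps going after the file name: real + "/" + anything.
        if path.startswith(real + "/"):
            return real
    return None


def phantom_share(paths, real_files):
    """Fraction of the listed paths that look like phantoms."""
    if not paths:
        return 0.0
    phantoms = [p for p in paths if phantom_base(p, real_files) is not None]
    return len(phantoms) / len(paths)


# Sample data taken from the example urls in this post (host name dropped).
real_files = ["/sitemap.shtml", "/tos.shtml"]
listed = [
    "/sitemap.shtml",
    "/tos.shtml",
    "/literature/",
    "/sitemap.shtml/tos.shtml",
    "/sitemap.shtml/literature/",
    "/sitemap.shtml/literature/literature/tos.shtml",
]
share = phantom_share(listed, real_files)
```

On a real export, `real_files` would be the known list of file urls and `listed` the full "all urls" report.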

Is it Googlebot, something I have done, or my server...?

What should I do?

Thanks for reading :)

tedster

2:10 am on Jul 12, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



How does your server handle these "phantom urls" -- or any bad url at all? The first thing I would check is that the server actually returns a 404 in the http header when an incorrect url is requested. In some cases, so-called "custom 404" handling actually returns a "200 OK" header or a "302 temporary redirect" -- and that can mean trouble with Google.

There is a Firefox extension called Live HTTP Headers that will allow you to verify what your server is actually telling Googlebot about these urls.
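The same check can be scripted if a browser extension is not handy. The following is only a rough sketch using Python's standard `http.client`; the function names `fetch_status` and `verdict` and the User-Agent string are invented here, and the verdict wording just restates the 404 / 200 / 302 cases described above. Redirects are deliberately not followed, so a 302 shows up as a 302.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Send a HEAD request and return the raw status code (no redirect following)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # The User-Agent value is a placeholder chosen for this sketch.
        conn.request("HEAD", path, headers={"User-Agent": "header-check-sketch"})
        return conn.getresponse().status
    finally:
        conn.close()


def verdict(status):
    """Interpret the status a server gave for a url that should not exist."""
    if status == 404:
        return "ok: server says the url does not exist"
    if status == 200:
        return "trouble: bad url answered with 200 OK"
    if status == 302:
        return "trouble: bad url answered with a 302 temporary redirect"
    return "other: inspect manually"
```

Usage would be something like `verdict(fetch_status("http://example.com/no-such-page"))`, where the url is one you know should not exist.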

jdMorgan

2:51 am on Jul 12, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Also, this looks like it could have been caused by a URL redirect gone bad -- recursively redirecting to the 'literature' subdirectory of the current (sub)directory, and limited by Google or by RewriteOptions MaxRedirects=7.

example.com/sitemap.shtml/literature/literature/literature/literature/literature/literature/literature/tos.shtml
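To illustrate how a loop like that could build the url above, here is a small sketch. It assumes each redirect hands back the relative location `literature/` and that the client resolves it against the current url, stopping after 7 hops. The function name `follow_relative_redirects` is hypothetical, and this is not a claim about what is actually configured on the site in question.

```python
from urllib.parse import urljoin


def follow_relative_redirects(start_url, location, max_redirects):
    """Resolve the same relative Location repeatedly, as a looping redirect would."""
    url = start_url
    for _ in range(max_redirects):
        # Each hop appends the relative target to the current "directory".
        url = urljoin(url, location)
    return url


looped = follow_relative_redirects(
    "http://example.com/sitemap.shtml/", "literature/", 7
)
# A relative link to tos.shtml on the final page then resolves underneath it.
phantom = urljoin(looped, "tos.shtml")
```

Note the trailing slash on the starting url: without it, `literature/` would resolve to the site root and nothing would stack up.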

Jim

proboscis

3:11 am on Jul 12, 2006 (gmt 0)

10+ Year Member



Hello,

The phantom urls all resolve. I checked the headers on some and it's a 200.

example.com/sitemap.shtml
example.com/sitemap.shtml/tos.shtml
example.com/sitemap.shtml/literature/tos.shtml

All of the above go to the same page and all return a 200.

A URL redirect gone bad? How do I find out what that is?

Thank you :)

Oh, I just tried this:

example.com/sitemap.shtml/asldkhfasjgasjaskjg

Just a real url followed by a / and random letters and that works too and is also a 200...
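That test is easy to repeat across several pages. A minimal sketch, assuming nothing more than the recipe above (real url, a slash, then random letters); `probe_url` is a made-up helper name:

```python
import random
import string


def probe_url(real_url, length=12, rng=random):
    """Build a url that cannot exist: a real url, a slash, then random letters."""
    suffix = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return real_url.rstrip("/") + "/" + suffix


probe = probe_url("http://example.com/sitemap.shtml")
```

Requesting each probe and looking at the status code shows whether the server accepts extra path segments after a file name on every page or only on some.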