Now there's an odd one -- no, I have never seen that. But there have been lots of oddities about in the past few weeks.
I've been noticing Google indexing pages that don't exist a lot recently.
This is all part of their effort to inflate their index size to aid in the "mine's bigger" contest they're engaged in with Yahoo (also known as being a publicly traded company and caring more about stock price than quality of results).
I also have Google showing pages in the index that have never existed, as well as a slew of pages that were removed and have been 301'd for 6-12 months.
Anyone have any idea why Google would be showing backlinks for index.html when I have never had an index.html? None of the pages they show actually link to index.html; they all link to www.mydomain.com.
Wanted to repost to bump; the mods didn't activate the thread for a couple of days.
What does your server respond when someone requests index.html? I can see how any spider might be programmed to guess at the most common index page name, so how the server responds would be key. For example, I've seen "custom 404" pages that were actually set to serve a 200 header.
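If you want to check this yourself, here is a rough Python sketch. It sends a HEAD request for a page that should not exist and labels the answer. The function names and the labels are my own, made up for illustration, and it assumes the server answers HEAD requests the same way it answers GET.

```python
import http.client
from urllib.parse import urlsplit


def fetch_status(url, timeout=10):
    """Return the raw status code for url without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # http.client does not follow redirects, so a 301/302 is reported as-is.
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


def classify_missing_page(status):
    """Label the status a server returned for a page that should NOT exist."""
    if status == 200:
        # A "custom 404" page served with a 200 header looks like a real page.
        return "soft 404"
    if status in (301, 302, 303, 307, 308):
        return "redirect"
    if status in (404, 410):
        return "not found"
    return "other"
```

For example, `classify_missing_page(fetch_status("http://www.mydomain.com/index.html"))` would tell you whether a spider guessing at index.html gets a real 404 or something that looks like a live page.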
Google has shown this behaviour for years: "index.html" is merged with "/".
No custom 404, just a plain old 404.
Server Response: [mysite.com...]
HTTP Status Code: HTTP/1.1 404 Not Found
Date: Wed, 05 Oct 2005 18:39:31 GMT
Server: Apache/2.0.46 (Red Hat)
I use index.shtml and always have. The site is 4+ years old.
For the past year Google has indexed www and non-www versions of almost every page; they show about twice as many pages indexed as I actually have.
Don't you think it's a bit strange that they take the liberty of merging index.html with / but can't figure out that identical www and non-www pages should be merged?
If my employees did something this stupid, I would fire them!
Indeed, sometimes Google's behaviour is strange. However, I've also seen several cases where they merged www and non-www pages.
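To get a feel for how much of an inflated page count comes from these duplicates, you can normalise a list of indexed URLs yourself. This is only a sketch of the kind of merging being discussed, not how Google actually does it. The list of index file names and the choice of the www host as the preferred one are my assumptions.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of default index file names; adjust for your own server.
INDEX_NAMES = {"index.html", "index.htm", "index.shtml", "index.php"}


def canonical_url(url):
    """Map www/non-www and index-file variants of a URL to one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Assumption: the www host is the preferred version of the site.
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"
    directory, _, filename = path.rpartition("/")
    # Treat /dir/index.html the same as /dir/
    if filename.lower() in INDEX_NAMES:
        path = directory + "/"
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def count_unique_pages(urls):
    """Count the distinct pages left after normalising every URL."""
    return len({canonical_url(u) for u in urls})
```

Feeding it `http://mydomain.com/`, `http://www.mydomain.com/` and `http://www.mydomain.com/index.html` counts them as a single page, which is the merge you would hope a search engine makes.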
I've noticed a strange G cache lately for my site as well. The site is over 2 years old, and recently, after multiple 302 redirect hijackings, I have noticed that G is showing a cache of:
Now my main index page is not showing in the index of course, and there is no G cache. Should be in there:
Anyone ever seen this kind of activity? Suggestions?
Suggest ... use MSN instead and tell your friends to do same