|Wrong Site in Google Cache / Info|
Searching for URL or info:URL brings up wrong site
| 4:06 pm on Mar 10, 2005 (gmt 0)|
My latest Google oddity: A site (let's call it example.com) has been indexed for many years, and now Google appears to be confusing it with another site (which we will call othersite.tld).
Searching Google for info:example.com produces results entirely unrelated to example.com - more accurately, it brings up results about othersite.tld. The display is essentially:
Other Site's Title [hyperlinked to othersite.tld]
Other Site's Description
Google can show you the following information for this URL:
* Show Google's cache of example.com
* Find web pages that are similar to example.com
* Find web pages that link to example.com
* Find web pages that contain the term "example.com"
If you follow any of the "following information" links, the first three lead to information about othersite.tld . That is, the cache is of othersite.tld, the "find similar pages" link searches for "related:othersite.tld", and the "websites that link to" link searches for "link:[...]:othersite.tld".
Additionally, the cache of othersite.tld is a year old, whereas Google has spidered and has previously displayed much more recent cached versions of example.com .
What can cause Google to go so wildly wrong in its manner of indexing this site?
[edited by: ciml at 5:22 pm (utc) on Mar. 10, 2005]
[edit reason] Examplified [/edit]
| 7:13 pm on Mar 11, 2005 (gmt 0)|
Is the website on its own IP address or a shared one?
| 9:35 pm on Mar 11, 2005 (gmt 0)|
It's on its own IP. The two sites at issue have different hosts, different nameservers, different owners, different registrars, and... are even in different languages.
The problem appears to stem in part from something alluded to in another thread - a webhost's rather incredible decision to block Googlebot - and the host claims that is now resolved, so perhaps things will go back to normal. But I am still scratching my head about how Google made such a wildly erroneous connection between the domain name of the first site and the URL of the second.
| 4:26 am on Mar 14, 2005 (gmt 0)|
I think I may understand what happened, although it is hard to believe Googlebot would make this mistake. My webhost caused my site to go down a couple of weeks ago, which caused any attempt to load the site to redirect to an Ensim "rollout" page. It so happens that the site which Google is confusing with mine also redirects to an Ensim "rollout" page. So I'm thinking that Googlebot saw "identical content" and decided "so they must be the same site".
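To make that guess concrete, here is a minimal Python sketch of how a crawler that fingerprints page bodies could end up lumping two unrelated hosts together when both serve the same placeholder page. This is purely an illustration of the hypothesis, not a description of how Googlebot actually works; the function names, the placeholder markup, and thirdsite.tld are all made up for the example.

```python
import hashlib
import re


def fingerprint(html):
    """Hash a page body after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", html).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_by_content(pages):
    """Map each content fingerprint to the list of hosts serving that content."""
    groups = {}
    for host, html in pages.items():
        groups.setdefault(fingerprint(html), []).append(host)
    return groups


# Hypothetical stand-in for the host's default "rollout" page.
PLACEHOLDER = "<html><body><h1>Site rollout in progress</h1></body></html>"

pages = {
    "example.com": PLACEHOLDER,
    # Same page with different whitespace, as another server might emit it.
    "othersite.tld": PLACEHOLDER.replace(" ", "\n  ") + "\n",
    "thirdsite.tld": "<html><body>Real content</body></html>",
}

# Hosts whose content is indistinguishable to the fingerprint.
merged = [hosts for hosts in group_by_content(pages).values() if len(hosts) > 1]
print(merged)
```

Under these assumptions, example.com and othersite.tld land in the same group even though nothing else about them matches, which is the kind of false "same site" conclusion I am speculating about.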
| 4:39 pm on Mar 14, 2005 (gmt 0)|
I have seen similar results before, although when I saw it, it was the result of a tracking URL that redirected to the site Google had confused. You're the first other case I've read about where Google purports to show you info about one site but on mouseover offers details of another site. I guess I could have missed people talking about it in one of the pagejacking threads, but I believe it is just a very rare error at this point.
| 5:14 pm on Mar 14, 2005 (gmt 0)|
I have managed to find only one incident which seems to replicate the error, if not the cause, described at (of all places) Google Answers, just shy of three years ago. The answer was a not-so-comforting statement that Google would likely correct the error in its next crawl. That answer appears to be from a Google staff member, as opposed to one of the independent contractors.
For now, I wait.
| 7:03 pm on Mar 14, 2005 (gmt 0)|
You seem to have such a clear-cut example of Google gone awry, it would be disappointing if GG didn't express an interest. If Google figures out how your particular oddity happened, it could give them insight into the cause of other oddities and possible preventative measures.
| 11:37 pm on Mar 17, 2005 (gmt 0)|
Two new developments, neither of which has yet affected the problem. I received a form letter response to a bug report I submitted to Google, which apologized for Google's inability to provide individualized feedback and which unfortunately did not contain any helpful information. (I don't hold this against Google, given the volume of feedback they have to plow through.)
The more promising development is that the owner of the site which is being confused with mine is trying to manually remove that site from Google, and has put up a new index page at the URL in an effort to convince Google that we do in fact operate very different sites. (As I mentioned previously, the other site has been redirecting to an Ensim "new site" page.)
| 12:41 pm on Mar 22, 2005 (gmt 0)|
Google is now properly associating www.example.com with the actual site, but remains confused about example.com. Still, progress is progress.
| 8:44 pm on Mar 22, 2005 (gmt 0)|
I think the answer is to set up a 301 redirect from non-www to www on your site, if you haven't already done so.
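For anyone unsure what that redirect amounts to, here is a rough Python sketch of the decision a server would make. It is only an illustration under assumptions: the function name `www_redirect` and the `canonical` parameter are invented for this example, and on a real site you would normally configure this in the web server rather than in application code.

```python
def www_redirect(host, path="/", canonical="www.example.com"):
    """Return (status, headers) for a 301 if the request hit the bare domain.

    Returns None when no redirect is needed. `canonical` is the preferred
    hostname (an assumption for this sketch).
    """
    # Derive the bare domain from the canonical www hostname.
    bare = canonical[len("www."):] if canonical.startswith("www.") else canonical
    # Ignore case and any port number in the Host header.
    hostname = host.lower().split(":")[0]
    if hostname == bare:
        location = "http://" + canonical + path
        return "301 Moved Permanently", [("Location", location)]
    return None


print(www_redirect("example.com", "/page.html"))
print(www_redirect("www.example.com", "/page.html"))
```

The first call yields a permanent redirect to the www hostname with the path preserved; the second returns None because the request already used the preferred hostname.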
| 2:39 pm on Mar 23, 2005 (gmt 0)|
Thanks for the suggestion. It is my perhaps naive belief, though, that if Googlebot revisits [example.com...] it should update its cache and info even if there is no redirect, and if it isn't revisiting example.com it won't find the redirect. It is also my impression that Googlebot treats 301 redirects with some degree of suspicion, and that the SERPs sometimes continue to include redirected URLs without a description, while possibly also applying a "duplicate content" penalty to the new page. I won't rule anything out if this doesn't resolve, but given the gradual improvement I am willing to wait a bit longer before trying to force Googlebot's hand.