|Mismatch between link/snippet and cached page|
I've been having problems lately with some of my sites where the link and snippet in Google's results don't match the text of the cached page (<title> and meta description). Instead, the link and snippet are getting pulled from one of my other sites, which sits on the same server but on a different domain and in another language.
So for example, I have a .es site and a .jp site on the same server, and all is well: the .es site shows up for searches in Spanish on google.es, and the .jp site shows up for searches in Japanese on google.jp. Then one day my .es site suddenly starts showing up for searches in Japanese on google.jp with the Japanese homepage title and snippet. But the cached page is still the normal Spanish one! And at the same time, on google.es, I still show up for the same Spanish terms, but the snippet and link are in Japanese too!
The problem seems to last just 1 crawl, so after a few days I get crawled again and everything reverts to normal. It's happened 4 times now, each time with the same symptoms but a different random pair of sites (I have ~40 country sites with different languages and each has a local TLD). The cache always shows the 'correct' content, that is to say the content that should be showing on the domain, but the title and snippet don't match that. Does Google pull the cache from a different place than the link and snippet? Has anyone else ever seen this happen, and more importantly, how did you fix it? I am at a loss trying to figure this one out, and my developers are no help since we have never been able to see the problem on the site itself; only in the Google results (and never in any other engine).
Thanks so much if anyone has even a glimmer of an idea on what this could be, or ways for me to investigate.
Hello Kate, and welcome to the forums.
|Does Google pull the cache from a different place than the link and snippet? |
Yes - what you're seeing is evidence of something we might already have assumed. With this much data on their hands, the "division of labor" across their huge server farm (700,000 boxes, I've seen as a recent guesstimate) is intense. The snippet team does not draw the data from the cache servers - they do their own thing off in the corner and that gets mixed back into the final SERPs. In fact, using automation to construct a useful snippet for the end-user is a critical part of their mission.
|Has anyone else ever seen this happen, and more importantly, how did you fix it? |
Yes, I have, but not cycling over and over. Often there was some DNS error, and fixing that fixed the bad title and snippet, never to go bad again. So what is your situation? Tough question - a cyclical DNS issue seems unlikely to me.
Thanks for your answer, but does the "snippet team" also handle the link text? I thought those were always pulled from the <title> tag. In this case, though, it seems like they are being taken at the same time the snippet is made, i.e. not from the same fetch as the page cache.
Hm, I don't think there's a DNS problem, since all the domains live on the same server. I use a DB to associate a domain to a particular language/country's content (all the domains are CNAMEs of my primary .com).
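For anyone following along, here is a minimal sketch of the kind of host-to-content lookup described above. Everything in it is an assumption for illustration: the `SITE_BY_HOST` table stands in for the real DB, and the domain names and locale codes are made up. The one design point it shows is that an unknown or malformed Host value raises an error instead of quietly falling back to some other site's content.

```python
from typing import Dict

# Hypothetical stand-in for the DB table that maps a domain to its content.
SITE_BY_HOST: Dict[str, str] = {
    "example.es": "es-ES",
    "example.jp": "ja-JP",
}


def normalize_host(host_header: str) -> str:
    """Lower-case the Host header and drop any port, trailing dot, or 'www.' prefix."""
    host = host_header.strip().lower()
    if ":" in host:
        host = host.split(":", 1)[0]
    host = host.rstrip(".")
    if host.startswith("www."):
        host = host[4:]
    return host


def locale_for_host(host_header: str, table: Dict[str, str] = SITE_BY_HOST) -> str:
    """Return the locale configured for this host, or raise if there is none."""
    host = normalize_host(host_header)
    try:
        return table[host]
    except KeyError:
        # Fail loudly: serving a default site's content here would be
        # indistinguishable from the cross-domain mix-up described above.
        raise LookupError(f"no site configured for host {host!r}")
```

Logging every `LookupError` along with the requesting user agent would at least show whether a crawler ever hits a host the table does not know about.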
One thing I have noticed is that the problem seems to appear on the same day that we push live releases to the site (usually Thursday). It doesn't happen every Thursday of course, but each time I've had the mismatch between countries, the last crawl date on the cached page was a Thursday. The data sample is small, so maybe that's just a coincidence.
Is there any way to figure out where a snippet is coming from (like which data center, or which bot)? I'm grasping at straws here.
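One way to investigate from the site's side, given the possible link to release days, is to request every domain's homepage during a release and compare the <title> that comes back with the one expected for that domain. The sketch below is only that, a sketch: the domains and titles are placeholders, `find_mismatches` and `fetch_as_googlebot` are hypothetical helper names, and sending Googlebot's user-agent string only imitates the crawler. It cannot tell you which data center or bot produced a given snippet.

```python
import re
import urllib.request
from typing import Callable, Dict, List, Tuple

# User-agent string published by Google for its main crawler.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html: str) -> str:
    """Return the page's <title> text with whitespace collapsed, or '' if absent."""
    match = TITLE_RE.search(html)
    return " ".join(match.group(1).split()) if match else ""


def fetch_as_googlebot(domain: str) -> str:
    """Fetch a domain's homepage while sending Googlebot's user-agent string."""
    request = urllib.request.Request(
        f"http://{domain}/", headers={"User-Agent": GOOGLEBOT_UA}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def find_mismatches(
    expected_titles: Dict[str, str],
    fetch: Callable[[str], str] = fetch_as_googlebot,
) -> List[Tuple[str, str, str]]:
    """Return (domain, expected, actual) for every domain serving the wrong title."""
    mismatches = []
    for domain, expected in expected_titles.items():
        actual = extract_title(fetch(domain))
        if actual != expected:
            mismatches.append((domain, expected, actual))
    return mismatches
```

Running `find_mismatches` in a loop every few seconds while a release is going out, and logging anything it returns, would show whether a domain ever briefly serves another country's homepage.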
Over time, I've seen the linked title on a search result coming from at least four sources, depending on the specific search that was entered:
1. The page's actual title element
2. The site's description at DMOZ
3. Anchor text from a backlink (now rare)
4. Just the URL (with "noindexed" pages - also rare)
There is an algo that decides which one to match up with any particular query. Obviously choosing a clickable title is critical for end-user satisfaction so I'm sure it gets thoroughly studied, but I don't know whether title formation is handled by the snippet team or if that algo is managed by another team.
If you've got a DMOZ entry and do not want Google to use that as a title in their search results, you can use the NOODP meta-tag [webmasterworld.com] in your pages.