Allow me to illustrate the problem (possible bug?).
Let's take a look at two data centers: [64.233.161.104...] and [72.14.207.104...]
For example, for the query site:nytimes.com
[64.233.161.104...] shows 29.700.000 results
While [72.14.207.104...] shows 36.000.000 results
I have tested several other sites and saw more or less the same behavior.
What could be the reason for such strange behavior of the site: operator?
Thanks.
[edited by: tedster at 3:41 pm (utc) on April 2, 2008]
[edit reason] fixed link [/edit]
Of course, the other possibility is that the "site:" operator is functioning well, but the two data centers I mentioned contain different volumes of data.
As such, we might expect [72.14.207.104...] to contain around 20% more data than [64.233.161.104...].
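For anyone who wants to check the arithmetic behind that "around 20%", here is a quick sketch in Python. The two counts are the site:nytimes.com figures quoted above; the function name `relative_increase` is just my own label, not anything official.

```python
def relative_increase(smaller: int, larger: int) -> float:
    """Return how much bigger `larger` is than `smaller`, as a percentage."""
    return (larger - smaller) / smaller * 100.0


# site:nytimes.com result counts reported by the two data centers
count_64 = 29_700_000  # 64.233.161.104
count_72 = 36_000_000  # 72.14.207.104

pct = relative_increase(count_64, count_72)
print(f"72.14.207.104 reports {pct:.1f}% more results")
```

That comes out at a little over 21%, so "around 20%" is a fair rounding. It only describes this one query, of course, and says nothing by itself about the overall size of either data center.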
Having said that, I'm aware of what Matt Cutts wrote once in 2006 [mattcutts.com]:
In the middle of that session, I talked about the frustration that modern data center watchers will encounter these days (because there are often slightly different things at different places) and I mentioned a slide from Boston Pubcon......Can you imagine trying to monitor that, especially when the same IP address can query different data centers for different people? It wouldn’t be my preferred hobby.
[edited by: reseller at 3:51 pm (utc) on April 2, 2008]
64.233.161.104 - 12.27 billion
72.14.207.104 - 12.63 billion
In other words, they're just about the same size. I think BillyS has a good idea when he mentions "tweaking their...estimating logic for the site: command." With the current "flux" in Google, many webmasters have commented that those estimates, which had improved, have recently become less accurate.
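To put rough numbers on "just about the same size", here is a small Python sketch. It assumes the 12.27 and 12.63 billion figures are total result estimates for each data center, and it reuses the site:nytimes.com counts from the first post; the helper name `pct_gap` is my own.

```python
def pct_gap(a: float, b: float) -> float:
    """Percentage by which b exceeds a."""
    return (b - a) / a * 100.0


# Assumed to be overall index-size estimates for each data center
index_gap = pct_gap(12.27e9, 12.63e9)

# site:nytimes.com counts from the first post
site_gap = pct_gap(29_700_000, 36_000_000)

print(f"index size gap: {index_gap:.1f}%")
print(f"site: count gap: {site_gap:.1f}%")
print(f"ratio of the two gaps: {site_gap / index_gap:.1f}x")
```

If those assumptions hold, the gap in the site: counts is several times larger than the gap in overall size, which fits better with a change in the estimating logic than with one data center simply holding more data.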
But that leaves us with a question: on which of the two DCs are the folks at the plex doing the tweaking? I can't imagine they are tweaking all over the place. In that case, I'd say [72.14.207.104...].
However, we have seen the problem of high site: results before. I'd like to recall another interesting 2006 post [mattcutts.com] by Matt Cutts, where he mentioned the high site: results estimates:
- high site: results estimates. I believe that more accurate site: results estimates are live everywhere now.
I'm also from the camp that these small observations - especially on such a relatively obscure query (one used a lot by webmasters....) - sometimes are fallout from larger changes behind the scenes. In other words, an unintentional change occurring from an intentional change. I also think this is why Matt is interested in these observations.
Agreed. Power to you!
I'm beginning to think that a software update (an infrastructure update) similar to BigDaddy might have been taking place during the last two weeks or so.