Forum Moderators: Robert Charlton & goodroi
For the last few days, on the "experimental" DC, a search for a phone number that has been completely removed from the web over the last few years has occasionally returned zero results. That is the correct result if Google has finally cleaned up the old supplementals. Previously this erroneous search returned 900 vague supplemental results that didn't match the search query, instead of just the few dozen relevant supplemental results (for deleted pages and expired domains).
Today, many DCs return zero results every time for this and several other similar queries for stuff that Google should have cleaned up long ago.
Now, is this a DC that has been cleaned of old Supplemental Results, or is it a DC where the Supplemental data is simply missing and Google is going to add it back in again over the next few days?
Time will tell.
.
A large website whose domain expired two weeks ago had 12 000 pages listed, many of them supplemental for the last few years. The root shows a "domain expired" message. All other pages are gone from the site.
Google reindexed the site and overnight the number of listed pages has been reduced to under 100 on the "experimental" DC. It seems like Google is aggressively throwing away old data, whereas before they would have held on to it for years and years...
On the old "normal" DCs, Google still shows 12 000 pages listed.
.
Has the "big cleanup" started, or is this just some glitch?
64.233.161.107, 64.233.161.147, 64.233.179.99, 64.233.179.104, 216.239.37.99, 216.239.39.99, 64.233.161.99, 64.233.161.104, 216.239.39.104, 216.239.37.104, 64.233.179.107, 72.14.207.107, 216.239.37.107, 216.239.39.107
And streamed down results that find ONLY my page here:
64.233.171.99, 64.233.171.104, 64.233.171.107, 64.233.171.147, 64.233.185.99, 64.233.185.104, 64.233.187.99, 64.233.187.104, 64.233.189.104, 216.239.53.99, 216.239.57.99, 66.102.7.99, 72.14.203.99, 72.14.203.104, 72.14.203.107, 72.14.207.99, 72.14.207.104, 72.14.207.107, 64.233.167.99, 64.233.167.104, 216.239.63.104, 216.239.53.104, 66.102.7.104, 216.239.57.104, 216.239.53.107, 216.239.57.107, 216.239.57.147, 66.102.7.147, 64.233.167.147
These appear to show in-between results:
64.233.183.99, 64.233.183.104, 216.239.59.99, 66.102.11.99, 66.102.9.99, 216.239.59.104, 66.102.11.104, 66.102.9.104, 64.233.183.107, 66.102.9.107, 216.239.59.107, 216.239.59.147
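For anyone keeping track of which DC falls into which camp, here is a rough Python sketch of the bookkeeping. It assumes you have already noted the "site:" count from each DC by hand. The function name `group_datacentres`, the two thresholds, and the counts in `observed` are all made up for illustration; only the IPs come from the lists above.

```python
from collections import defaultdict


def group_datacentres(counts, full_threshold=0.9, purged_threshold=0.1):
    """Group DC IPs by how their result count compares with the largest count seen.

    counts maps a DC IP to the number of results it reported.
    The thresholds are arbitrary assumptions, not anything Google publishes.
    """
    if not counts:
        return {}
    largest = max(counts.values())
    groups = defaultdict(list)
    for ip, n in counts.items():
        # If every DC reports zero, treat the ratio as 0.0 to avoid dividing by zero.
        ratio = n / largest if largest else 0.0
        if ratio >= full_threshold:
            label = "full"
        elif ratio <= purged_threshold:
            label = "purged"
        else:
            label = "in-between"
        groups[label].append(ip)
    return {label: sorted(ips) for label, ips in groups.items()}


# Hypothetical counts, written down by hand for four of the DCs listed above.
observed = {
    "64.233.161.99": 12000,
    "216.239.37.99": 11800,
    "64.233.171.99": 95,
    "64.233.183.99": 4000,
}
groups = group_datacentres(observed)
```

Re-running this each day against fresh counts would show whether DCs are migrating from one group to another, which is really the question here.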
I don't think Google's broken -- I think this is an intentional purge. I'm a little bummed about it because they are crawling a lot of my pages and then withholding them from results for reasons unknown to me (they are especially withholding newer pages with unique content). I guess all I can do is try to improve my site and hope for incremental improvements over time.
However, I do see things like 3500 pages in BD and 30 in the "experimental" DC. The "experimental" DC now has two versions, and one seems to have dropped all Supplemental Results from before June 2005.
A search for another of my sites revealed:
"Your search - site:myotherdomain.com - did not match any documents."
However, upon checking my logs, I am getting Google referrals for the site. Checking Google, the site appears in the SERPs for certain searches but only returns Supplemental results (all pages are supplemental for this site!).
So as far as I can see the site: command does not include supps, only pages from the main index.
Nice find, g1smd - so we have confirmation that the "site:" search does not work the way it should, or at least the way we would like it to.
So what we have here are -- apparently -- some (supplemental) pages that are not being counted as part of the site on which they are hosted... In other words, the link between page and site has been broken.
It is not uncommon for a "site:" search to occasionally return "No results". I know because I am monitoring this stuff quite carefully at the moment. Bear in mind that a datacentre is actually a large cluster of machines. When a datacentre is in flux, you will get wildly different results from it from one second to the next as your requests are handled by the different machines in the cluster. This will happen for a short period and then settle down. During this period, you will often get "no results" for a "site:" search from some of the machines in the cluster.
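One way to tell a DC in flux from a settled one is to repeat the same query a few times and compare the counts. A minimal Python sketch of that check follows, assuming the counts were collected by hand from a single DC over a short period. The function name `is_in_flux`, the 5% tolerance, and both sample lists are assumptions made up for illustration.

```python
def is_in_flux(samples, tolerance=0.05):
    """Return True if repeated counts from one DC spread by more than
    `tolerance` (as a fraction of the highest count seen)."""
    if len(samples) < 2:
        return False  # a single reading cannot show any variation
    highest, lowest = max(samples), min(samples)
    if highest == 0:
        return False  # consistently zero is stable, not in flux
    return (highest - lowest) / highest > tolerance


# Hypothetical readings for the same "site:" query against one DC.
settled = [3500, 3500, 3490, 3500]
fluctuating = [3500, 0, 30, 3500, 0]

settled_flag = is_in_flux(settled)
fluctuating_flag = is_in_flux(fluctuating)
```

A lone "no results" in the middle of otherwise steady counts would flag the DC as in flux rather than purged, which matches the behaviour described above.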