|site: shows different totals depending on SERP page?|
If I do a site:example.com at Google, it shows results 1-100 of 4,400.
But at the bottom of the page, there are only 5 pages of results.
If I go to page 5, it now tells me 401-415 of 415.
Usually this number shows 1,800-2,100 for this site. All the pages are very different, with long, hand-written, relevant articles; we have about 2,000 of these articles, plus navigation pages. We've robots.txt'd out all the duplicate pages like pop-up image frames, comments, and so on (rough example at the end of this post).
Why the discrepancy? It isn't the first time I've seen this happen.
If I try to "repeat the search with the omitted results included," it behaves basically the same way: 5 pages, 400-odd results.
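For reference, the robots.txt exclusions I mentioned look roughly like this -- the paths below are just placeholders standing in for our actual duplicate sections, not the real ones:

    User-agent: *
    # hypothetical paths standing in for the duplicate sections we block
    Disallow: /popup-images/
    Disallow: /comments/
    Disallow: /print/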
It's been like this for as long as I can remember - and it's one sign of the unusual way that Google handles its data. It's a major leap from vanilla MySQL to what Google has created.
Note that when you see the big number of pages, the actual message says "about 4,400 pages," but when the final, lower number gets displayed the word "about" is gone. Google takes its spidered raw data and shards it across all kinds of tables, with copies of those tables on different data centers.
They process each of these data sets in all kinds of ways and end up in a situation where they cannot easily give you an exact number of URLs. So on the early searches they give an estimate, only homing in on the "real" number at the very end. Note that I place the word "real" in quotes. Let's say you have five directories and you do five site: operator searches, each one covering a different directory: site:example.com/directory1/ through site:example.com/directory5/.
The total number of URLs you get with that method can often be more than you see for a basic search on site:example.com.
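To make that concrete with a toy example (purely an illustration of the counting problem, not Google's actual setup): if the same URL can sit in more than one shard, adding up per-shard counts gives a quick over-estimate, and only merging and deduplicating the full list -- which you'd only bother doing once someone pages to the end -- gives the exact figure.

    # Toy illustration only -- NOT Google's real architecture.
    # Each "shard" holds some of the site's URLs, with overlap between shards.
    shards = [
        {"/page1", "/page2", "/page3"},
        {"/page2", "/page3", "/page4"},
        {"/page4", "/page5"},
    ]

    # Cheap early estimate: add up per-shard counts (double-counts the overlaps).
    estimate = sum(len(s) for s in shards)        # 8 -> "about 8 pages"

    # Exact figure: merge everything and deduplicate.
    exact = len(set().union(*shards))             # 5 -> "5 pages", no "about"

    print(estimate, exact)

The same arithmetic is why the per-directory site: counts can add up to more than the single whole-domain figure.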
Last year Google made some changes in their data infrastructure, and one of the side effects was an improvement in the accuracy of the site: estimates. But more recently there has apparently been another change, because those estimates can again be out in left field, especially for smaller sites.
Hmmm, interesting! I did notice that when it went to 1,800-2,000 I was pleasantly surprised, because that's right about where Google should be indexing us.
Also a bit more info: if I look at the sitemaps section of my WM Tools area, it says my sitemap has 1,600 or so URLs in it (which I think is wrong! but that's another story) and that we have about 1,200 indexed in Google.
I wrote above that "It's been like this for as long as I can remember," but just in recent days it does seem to be more common and much stranger.
Maybe the big flux that we're talking about in the April SERP Changes thread [webmasterworld.com] is wreaking havoc with Google's ability to estimate the number of search results.
|Last year Google made some changes in their data infrastructure, and one of the side effects was an improvement in the accuracy of the site: estimates. But more recently there has apparently been another change, because those estimates can again be out in left field, especially for smaller sites. |
I am seeing a (very) recent change in site:example.com searches. The numbers are way out in left field... and the "repeat search with omitted results" is gone on many searches.
I am wondering why Google would even say "about" on a site: search... You would think they could easily say exactly how many URLs they've indexed for a given domain.
Until recently I could pretty much count on Google to tell me exactly how many pages were indexed when I SEO a new site... now the numbers are confuzzling.
To speed up looking through the SERPs, I set my preferences to 100 listings per page. A search for site:example.com produces: Results 1 - 100 of about 1,400 from example.com. (0.06 seconds)
I get to the end (7 pages with fewer than 700 listings) and click "repeat the search with omitted results" (which DID appear at the end of the 7th page on this particular site: search), and now get: Results 1 - 100 of about 1,370 from example.com. (0.05 seconds)
...but can still only get Google to produce 619 results... Results 601 - 619 of about 1,370 from example.com. (0.15 seconds)
The weird things are:
1. Intermittent appearance of the "omitted results" link, even when G produces fewer than half the "about" number of listings.
2. Fewer estimated pages with omitted results included (it used to always show more).
3. Inconsistent ability to retrieve the omitted results.
I DO NOT remember ever seeing this behavior before today.
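For anyone who wants to reproduce the paging above without clicking through every page, here's a sketch of the result-page URLs. num, start and filter are the query-string parameters as I understand them (results per page, offset, and "include omitted results"), so treat the details as an assumption rather than anything Google documents:

    # Sketch: build the Google result-page URLs for a site: search,
    # 100 listings per page. Parameter behavior is my own understanding.
    from urllib.parse import urlencode

    def serp_url(query, page, per_page=100, include_omitted=False):
        params = {"q": query, "num": per_page, "start": page * per_page}
        if include_omitted:
            params["filter"] = "0"   # "repeat the search with omitted results included"
        return "https://www.google.com/search?" + urlencode(params)

    for page in range(7):
        print(serp_url("site:example.com", page, include_omitted=True))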
One clue may be URLs with query strings... I just did another test using the site:example.com operator -- this time on a site that has a few CGI-based URLs... the omitted results produced two more listings -- both were example.com/cgi-bin/script.cgi?var=data type URLs... hmmmm...
Maybe that's it? Is Google removing dynamic URLs from the main index?
For some sites, changing the parameters in a dynamic URL can still get you the same content. I have seen Google group this kind of "duplicate URL" behind the omitted results link.
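A rough sketch of that grouping idea (my own illustration -- the "ignored" parameter names are hypothetical and this is not how Google actually computes anything): if session or tracking parameters don't change the content, several distinct URLs boil down to one page, and the extras are natural candidates for the omitted-results bucket.

    # Illustration only: strip parameters that don't affect content
    # (which ones those are is a per-site guess) to see which dynamic
    # URLs collapse onto the same page.
    from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

    IGNORED = {"sessionid", "ref", "sort"}   # hypothetical "no effect on content" params

    def canonical(url):
        parts = urlparse(url)
        kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED]
        return urlunparse(parts._replace(query=urlencode(sorted(kept))))

    urls = [
        "http://example.com/cgi-bin/script.cgi?var=data&sessionid=abc",
        "http://example.com/cgi-bin/script.cgi?sessionid=xyz&var=data",
        "http://example.com/cgi-bin/script.cgi?var=other",
    ]
    for u in urls:
        print(canonical(u), "<-", u)
    # The first two reduce to the same canonical URL -- one shows in the
    # main results, the other is the kind of listing that ends up behind
    # "omitted results".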
Tedster, this may be due to what's being discussed in the April 2008 Google SERP Changes [webmasterworld.com] thread, as you pointed out in the sticky.
I tested a site I have been working on this month. Earlier tonight it showed "about" 240 results... only half were retrievable, and no "repeat the search with omitted results" link appeared at the end of the last page. Two hours later, G reports "about" 87 results... all 87 are there.
1. Large sites with dynamic but similar URLs seeing some dropped?
2. Data not propagated to all data centers?
3. "Omitted" being relegated to supplemental, but supplementals no showing?
4. (Most likely) I HAVE NO IDEA... only GOOGLE KNOWS FOR SURE... and even they may not.