
New "site:" search rolling out

Google have learnt to count beyond 1000!

   
1:44 am on Jul 8, 2006 (gmt 0)

5+ Year Member



It looks like Google are rolling out a new version of the "site:www.yourdomain.com" search.

The new version finally seems to report accurate-ish numbers for page counts beyond 1000, instead of the usual order-of-magnitude exaggeration.

The new version is currently available on several DCs, but seems to be rolling out as I write, e.g.:

72.14.203.99
64.233.167.99
64.233.167.104

12:35 pm on Jul 8, 2006 (gmt 0)

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I just checked a few sites I'm familiar with (way too familiar, LOL) and still see the old bogus numbers. I'm glad you see a change, though. It's hopeful.

[edited by: tedster at 12:36 am (utc) on July 10, 2006]

4:44 pm on Jul 8, 2006 (gmt 0)

5+ Year Member



It's strange: the new, more accurate results only seem to show for certain sites.

Take a look at site:www.cnn.com on 72.14.207.99 for an example of the new data.

8:44 pm on Jul 8, 2006 (gmt 0)



Good catch. I just tried that data center and got credible numbers for my two sites. (4,760 for my main site, compared to Google.com's estimate of 24,600 pages.)
9:19 pm on Jul 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"... certain sites", I think is correct.

Whoops!
This is a page count discussion, please excuse me.

9:42 pm on Jul 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



That DC is even further out of line for ours. Maybe in a couple of centuries the 'plex's AI will have learned how to count.
10:27 pm on Jul 8, 2006 (gmt 0)

10+ Year Member



Those DCs are showing 2.9 million pages for my site, and trust me when I tell you that I do not have anywhere close to that many pages.
11:27 pm on Jul 8, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The site I have been watching is even worse than ever today, on that DC and my local one.
9:24 pm on Jul 9, 2006 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Across the 620 known IP addresses for Google, there are at least a dozen different site counts for every site out there.

The differences in reported page numbers are huge.

9:39 pm on Jul 9, 2006 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



The 72.14.207.99 shows a correct page count for the ODP (site:dmoz.org) for the first time in many years. All the pages have been non-www for years. The count for non-www had been massively inflated for quite a while.

Interestingly, it now also shows a few www pages, something it has not done since the days of the 302 redirect hijacks. (When I mentioned on WebmasterWorld that site:www.dmoz.org returned 30 million pages, none of which were on www.dmoz.org itself, that SERP went to zero results within a few hours and had stayed there ever since. The "302 bug" was not fixed for other domains that I was watching and which I did not mention here.)

[edited by: g1smd at 9:40 pm (utc) on July 9, 2006]

9:52 pm on Jul 9, 2006 (gmt 0)

WebmasterWorld Senior Member ken_b is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Same old bogus numbers for me. Not even close to reality.
10:40 pm on Jul 9, 2006 (gmt 0)

5+ Year Member



Those (bogus) numbers are the single largest factor in determining how much traffic Google sends me.
10:53 pm on Jul 9, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



##Apologies for being patronising, I'm just mad at this.##

I can't believe that people do not understand the correlation between inflated results and how Google views your site. It's been covered so many times by various people, you just have to read between the lines.

Here's the explanation:

When Google estimates (yes, estimates) the number of pages of yours that it has in its index, it does not count them all. It just counts how many are in the top 'x' results (given that pages are stored by overall 'value', maybe some variant of PR, in a rather large database). If your site is well-ranked then more of your pages will be in the 'sample' dataset, and hence Google will overestimate your total page numbers.

It would be foolhardy to count every page when looking up a site: search. This also explains why numbers generally get more accurate as you page through results: page 1 deals with less 'sample' data than page 10.

My guess is that most people who even know how to check things on Google will be above average on rankings, therefore the general opinion of the webmaster 'in the know' is that Google inflates page numbers. However, if your site is poorly ranked (very poorly ranked) then the number of pages returned for a site: query will be lower (as fewer pages will show up in the first 'x' results).

I could be wrong, but I'd stake a fair amount on this being at least a part of the way to explaining why site: counts are 'wrong'.
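To make the sampling idea above concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the index layout, the site names "strong" and "weak", the function names, and the extrapolation rule (hits in the sample scaled up by index size over sample size). It is not a description of how Google actually computes site: counts, only of how a top-'x' sample could inflate counts for well-ranked sites and deflate them for poorly ranked ones.

```python
# Toy model: the "index" is a list of site names, one entry per page,
# sorted best-first by some overall value. All names are hypothetical.

def build_demo_index():
    """Build a 10,000-page index containing two sites of 100 pages each."""
    index = ["other"] * 10_000
    # "strong" site: 50 pages spread through the top 1,000, 50 mid-index.
    for i in range(50):
        index[i * 10] = "strong"
        index[5_000 + i] = "strong"
    # "weak" site: only 2 pages in the top 1,000, 98 near the bottom.
    index[900] = "weak"
    index[901] = "weak"
    for i in range(98):
        index[9_000 + i] = "weak"
    return index

def exact_site_count(index, site):
    """Count every page of the site (the expensive full scan)."""
    return sum(1 for s in index if s == site)

def estimate_site_count(index, site, sample_size):
    """Count hits in the top `sample_size` entries and extrapolate.

    Assumed rule: estimate = hits * (index size / sample size).
    """
    sample = index[:sample_size]
    hits = sum(1 for s in sample if s == site)
    return round(hits * len(index) / len(sample))

if __name__ == "__main__":
    index = build_demo_index()
    for site in ("strong", "weak"):
        print(site,
              "exact:", exact_site_count(index, site),
              "estimate:", estimate_site_count(index, site, 1_000))
    # strong exact: 100 estimate: 500
    # weak exact: 100 estimate: 20
```

Both toy sites have exactly 100 pages, yet the well-ranked one is reported at five times its size and the poorly ranked one at a fifth. Raising sample_size toward the full index pulls both estimates back to 100, which fits the observation that counts tend to improve as you page deeper into the results.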

11:11 pm on Jul 9, 2006 (gmt 0)

5+ Year Member



Inbound,

Not sure if that was directed at me, but your post sounds reasonable to me.

4:51 am on Jul 10, 2006 (gmt 0)



inbound, I might buy your argument except for the fact that the vastly inflated numbers are a fairly recent phenomenon, unless I'm mistaken. I never encountered them before the middle of last year (it might have been even later than that), and I don't recall hearing complaints about inflated page numbers before then.

[edited by: europeforvisitors at 4:53 am (utc) on July 10, 2006]

9:28 am on Jul 10, 2006 (gmt 0)

WebmasterWorld Senior Member steveb is a WebmasterWorld Top Contributor of All Time 10+ Year Member



C'mon, I can't even imagine how or why they would "estimate". How hard is it to count an actual number of URLs with the same domain address?

(No improvement on counting pages that I can see.)

[edited by: steveb at 9:29 am (utc) on July 10, 2006]

9:43 am on Jul 10, 2006 (gmt 0)

10+ Year Member



" Cmon, I can't even imagine how or why they would "estimate". How hard is to count an actual number of URLs with the same domain address?"

Got a kick out of that statement! It has been a couple of years since I've seen a remotely accurate page count out of big G. I guess it is pretty hard, either that or the PhDs had to dump some of their basic education to make room for their Google egos.

7:26 pm on Jul 10, 2006 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



>> the vastly inflated numbers are a fairly recent phenomenon, <<

I had seen hints of it at least two years ago, but I no longer have the data that would have confirmed or denied what was happening in mid-2003 too.

 
