Results found vs. total estimated

I have a site that's barely still measurable.


killroy

11:22 am on Nov 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



This has been puzzling me for a while: the discrepancy between the last result actually shown and the estimated total.

For a while I was following a new site, and I first believed that the result estimate was my total number of pages as Google saw them from links, while the last page in the results (the BBB in "Results AAA-BBB of about CCC") was the actual number of pages indexed.

This was shattered when the CCC became larger than my real number. So currently the BBB is just over 500, the REAL number is around 740, and CCC is around 940.

Now I have another site at the edge of being measurable, i.e. BBB is 944 but CCC is 20,000. This doesn't make sense. Unfortunately I can't say exactly HOW many pages there are (it's a YP directory site, so it can be in excess of a few tens of thousands).

This is all done with
q=site:www.domain.com+-realgibberish&num=100&start=900&filter=0
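If anyone wants to track these numbers over time, the AAA/BBB/CCC values can be pulled out of the results summary line programmatically. A minimal Python sketch, assuming the summary text has the "Results AAA-BBB of about CCC" shape described above (Google's actual page markup may differ, and the summary string is an assumption here, not verified):

```python
import re

def parse_result_counts(summary):
    """Parse a 'Results AAA-BBB of about CCC' summary line.

    Returns (first, last, estimate) as ints, or None if the line
    doesn't match. The format is assumed from the thread's
    description of Google's results page, not from real markup.
    """
    m = re.search(
        r"Results\s+([\d,]+)\s*-\s*([\d,]+)\s+of\s+about\s+([\d,]+)",
        summary,
    )
    if not m:
        return None
    # Strip thousands separators before converting to int.
    return tuple(int(g.replace(",", "")) for g in m.groups())

# Example: the second site described above.
print(parse_result_counts("Results 901-944 of about 20,000"))
```

Logging the BBB and CCC values from each run would make the ±25% fluctuation mentioned below easy to chart.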

CCC does fluctuate widely though (±25%).

Which numbers can I rely on? Is BBB truly what Google sees? Or can keyword-specific searches bring up more than BBB different pages?

SN

killroy

1:41 pm on Nov 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



*bump*

davidpbrown

2:06 pm on Nov 24, 2003 (gmt 0)

10+ Year Member



I wonder if a stagnant site would get a more accurate estimate. Maybe CCC reflects BBB plus something proportional to how much Google doesn't yet know about recently updated pages and potentially new pages.

Can't think of an easy way to find stagnant sites... maybe AltV has a few.

dpb

Sharper

6:03 pm on Nov 24, 2003 (gmt 0)

10+ Year Member



I routinely see
allinurl:site.com &filter=0
and
site:site.com -asdggh & filter=0
searches return estimated result counts greater than the total number of pages a site has, often almost double the number the site actually has.

This is with sites whose total number of actual pages is anywhere from 1,200 to 50,000+. I've taken into account possible www vs. non-www and other URL-variant issues, and those don't explain it.

It's hard to be exact when you can't page through all the actual results, but I've pretty much satisfied myself that there's a correlation: the estimated count looks like the number of pages in the last "major" index update plus the number of pages spidered since then.

In other words, it seems like some pages get counted in the estimated results twice when they were in the index already, then got recently spidered.

However, in the actual results, those pages seem to only actually show up once.

plasma

6:14 pm on Nov 24, 2003 (gmt 0)

10+ Year Member



One of my customers has a shop with about 30000 pages.
~6000 pages have been crawled recently and also ~6000 pages are estimated.

But only ~550 hits are shown. (site:... -gfgfdsgfgfds)

So in my case, the estimation is correct, but not all pages are shown.