|No more "searching X billion pages"|
Just saw it was gone
I just saw that the Google homepage no longer mentions some weird number of pages that the search engine is supposed to have included in the index.
How new is this?
It's been a while now. I think it vanished around the time that it became very clear they were counting URLs (what is a "page" anyway?) and had started to index all kinds of duplicate and spurious URLs - anything that had a chance of resolving.
G and Y both seemed to give up on "bigger is better" right at the same time. The number I remember is 18 billion or some such insanity. Sorry, I can't come up with a date.
Google has had a very hard time accurately counting/reporting the number of pages in their index for ... well, ever since they began counting/reporting the number of pages in their index! At least they recognized the fact that it was a silly and completely meaningless (not to mention inaccurate) bit of self-promotion and have removed it!
In March 2005, Google claimed the following:
©2005 Google - Searching 8,058,044,651 web pages
Here's an old thread which you may find interesting (you may even remember participating) Claus: [webmasterworld.com...]
The last cached page that the WayBack Machine has on record for Google (or any other site for that matter) is from March 31, 2005. Google still showed the number of pages indexed at that point, but to be honest, I never noticed it had disappeared as I never believed the figures anyway!
By the way, do any of you remember this from December 1998? It's worth a giggle!
And I just love this page:
OT - Am I the only one around here who wasn't aware that the Wayback Machine died on April 1, 2005? Too bad, it was a really good/fun tool!
|OT - Am I the only one around here who wasn't aware that the Wayback Machine died on April 1, 2005? Too bad, it was a really good/fun tool! |
OT - :( :( :( I really liked it. But where did you read about it? I can't find this information anywhere... ah - now I can see that there are no caches after that date, but maybe these are just temporary problems?
Hi konrad ... somehow I have a hard time believing that 13 months might be a "temporary" thing! :(
"Why are there no recent archives in the Wayback Machine?
We do not add pages less than 6 months after they are collected, because of the time delayed donation from Alexa. Updates can take up to 12 months in some cases."
That's why it's called the 'way back' machine LOL
Or try this:
My alma mater makes page 1 out of 25 billion URLs! Of course, it's not all that well targeted.
Ah, so it's not just me that can't see it anymore, that's good to know ;)
As we only get <= 1,000 results returned anyway and have no way to verify the total, strictly speaking they could just generate a random number.
I have always assumed that Google only indexed "some fraction" of the pages on the web, and that a lot of "pages" were really just different URLs, not different pages.
However, I have assumed this fraction to be more or less constant (actually growing, but not as fast as the web grows), so that you could use the figure as a very rough indicator for the growth rate of the web, in terms of published documents.
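For what it's worth, here is a rough sketch of the kind of back-of-the-envelope growth estimate I mean. It only illustrates the arithmetic: the function name is made up, the 13-month gap is an assumption, and the two counts (the March 2005 homepage figure and the "25 billion" result count mentioned above) were not measured the same way, so the output should not be read as a real growth rate for the web.

```python
def annualized_growth(count_start: float, count_end: float, months: float) -> float:
    """Compound annual growth rate implied by two counts taken `months` apart."""
    if count_start <= 0 or count_end <= 0 or months <= 0:
        raise ValueError("counts and months must be positive")
    # Scale the observed ratio to a 12-month period, then subtract 1
    # to express it as a fractional increase per year.
    return (count_end / count_start) ** (12.0 / months) - 1.0


if __name__ == "__main__":
    start = 8_058_044_651   # homepage figure quoted in this thread (March 2005)
    end = 25_000_000_000    # "25 billion" result count mentioned in this thread
    months = 13             # ASSUMPTION: gap between the two readings
    rate = annualized_growth(start, end, months)
    print(f"Implied growth: {rate:.0%} per year (illustrative only)")
```

If the indexed fraction is not actually constant, or the counts include duplicate URLs as suspected, the number says more about the counting method than about the web.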