
Google News Archive Forum

This 60 message thread spans 2 pages: [1] 2
Searching 4,285,199,774 web pages
panos




msg:73503
 12:16 pm on Feb 17, 2004 (gmt 0)

Have you noticed that on google.com?

"Searching 4,285,199,774 web pages"

 

TinkyWinky




msg:73504
 12:33 pm on Feb 17, 2004 (gmt 0)

That's quite an increase - that'll be all the spam pages that are included without any Pigeon Rank then?

xcandyman




msg:73505
 12:37 pm on Feb 17, 2004 (gmt 0)

Finally the number increases. Now let's wait for a couple of days and another engine will up their tally.

engine




msg:73506
 12:37 pm on Feb 17, 2004 (gmt 0)

Seemed to flip over at midday, UK time.

takagi




msg:73507
 12:56 pm on Feb 17, 2004 (gmt 0)

The counter for images increased even more!

425,000,000 -> 880,000,000 images (+107%)
3,307,998,701 -> 4,285,199,774 web pages (+30%)

So now they can really say "The most comprehensive image search on the web."

For some time AltaVista had a higher number!
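
The percentages above can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the before and after counts quoted in this post; the helper name `percent_increase` is made up for illustration.

```python
def percent_increase(old: int, new: int) -> float:
    """Return the relative growth from old to new, in percent."""
    return (new - old) / old * 100


# Counts as quoted in the post above (old homepage figure -> new figure).
images = percent_increase(425_000_000, 880_000_000)
pages = percent_increase(3_307_998_701, 4_285_199_774)

print(f"images:    +{images:.1f}%")
print(f"web pages: +{pages:.1f}%")
```

Rounded to whole percent, the results agree with the +107% and +30% figures given above.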

Dayo_UK




msg:73508
 1:00 pm on Feb 17, 2004 (gmt 0)

Would be nice to see that number going down IMO.

Much too much mess out there to search through.

Most of my sites are getting lost in the thousands of pages of auto-generated nonsense :(

However, well done G on having the power ;).

ThomasB




msg:73509
 1:08 pm on Feb 17, 2004 (gmt 0)

I'm sure they could double the number just by deep-spidering all the datafeed-sites. :) I'd suggest them putting some quotes from WW members like "Best index ever!" on google.com. :)

andy_boyd




msg:73510
 1:12 pm on Feb 17, 2004 (gmt 0)

Dayo, I agree.

Fair enough, we can now search through billions of documents, but the average user will only ever see the first 30 results at the most and they need to be the best.

Quality not quantity.

ciml




msg:73511
 1:21 pm on Feb 17, 2004 (gmt 0)

Andy, a bigger index is important for precise, multi-word searches where there are very few matching documents.

It looks like Chndru and I were wrong [webmasterworld.com]. I have to wonder though, how many of those have been crawled? Google has a lot of 'URL-only' listings.

258cib




msg:73512
 1:31 pm on Feb 17, 2004 (gmt 0)

> Would be nice to see that number going down IMO.
>
> Much too much mess out there to search through.
>
> Most of my sites are getting lost in the thousands of pages of auto-generated nonsense :(
>
> However, well done G on having the power ;).

Exactly. I see the grist for dozens, if not hundreds, of media and tech columns in this new number. Not that the old number was any less daunting, but the change begs the question--what are we searching for? Is more better? Or is better better?

So, OK, media, tech, religion and philosophy columns--expect to hear about this from the pulpit in coming Sundays. (My favorite quote about the web--I can't remember who said it, alas: "The Internet has answered the question regarding a million monkeys, given an infinite amount of time, eventually writing a great novel. The answer is no.")

Chndru




msg:73513
 1:52 pm on Feb 17, 2004 (gmt 0)

Google Inc. today announced it expanded the breadth of its web index to more than 6 billion items. This innovation represents a milestone for Internet users, enabling quick and easy access to the world's largest collection of online information.

[home.businesswire.com...]

yup, ciml. maybe Yahoo's drop of G has triggered the change.

robertskelton




msg:73514
 1:53 pm on Feb 17, 2004 (gmt 0)

Reality: if it were not true, only 0.01% of searchers would hear about it.

Reality: only 0.01% of searchers would care

However, it means that Google has indexed far more pages than any other search engine ever. Expect AlltheWeb to announce 4.4 billion pages indexed very soon....

zgb999




msg:73515
 1:57 pm on Feb 17, 2004 (gmt 0)

So many sites can expect their pagerank to decrease (as a result of the additional pages, with no direct influence on the ranking).

kaled




msg:73516
 2:22 pm on Feb 17, 2004 (gmt 0)

The total is still below the limit of 32bit arithmetic - but only just.

Kaled.
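
For anyone who wants to check the 32-bit remark, here is a minimal sketch of the arithmetic. It assumes an unsigned 32-bit counter; whether Google's page count is actually stored that way is not known from this thread and is pure speculation.

```python
# Largest values representable in 32 bits.
UINT32_MAX = 2**32 - 1   # unsigned: 4,294,967,295
INT32_MAX = 2**31 - 1    # signed:   2,147,483,647

count = 4_285_199_774    # figure shown on google.com, as quoted in this thread

headroom = UINT32_MAX - count
print(f"fits in unsigned 32 bits: {count <= UINT32_MAX}")
print(f"fits in signed 32 bits:   {count <= INT32_MAX}")
print(f"headroom before overflow: {headroom:,}")
```

Under that assumption the count sits a little under ten million below the unsigned limit, and it is already well past the signed one.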

Chndru




msg:73517
 2:39 pm on Feb 17, 2004 (gmt 0)

>32bit arithmetic
ahh..that put-to-rest conspiracy theory.

anyways, i always wondered what these 4.28+ billion documents represent.

All the spidered web pages? Does it include the webpages that are of duplicate content? Does it include different session IDs of the same page? And the www and non-www version of the pages? And translated pages? graphical and non-graphical version of the pages?
Given all these, would the meaningful content be boiled down to around 1/5th of the index?

Bobby_Davro




msg:73518
 3:02 pm on Feb 17, 2004 (gmt 0)

And does it include all those pages in Google that consist only of a URL but no cached page? (and there are one heck of a lot of those)

brotherhood of LAN




msg:73519
 3:05 pm on Feb 17, 2004 (gmt 0)

>30% increase

I wonder if they're indexing a larger percentage of the web as a whole, or still falling behind with its growth.

IITian




msg:73520
 3:42 pm on Feb 17, 2004 (gmt 0)

Is this number more than the number of pages in amazon?

Brett_Tabke




msg:73521
 4:53 pm on Feb 17, 2004 (gmt 0)

used with permission:

GOOGLE ACHIEVES SEARCH MILESTONE WITH IMMEDIATE ACCESS TO MORE THAN 6
BILLION ITEMS

Google Connects Searchers to World's Most Comprehensive Index; Increases Web
Page and Image Collections

MOUNTAIN VIEW, Calif. - Feb. 17, 2004 - Google Inc. today announced it
expanded the breadth of its web index to more than 6 billion items. This
innovation represents a milestone for Internet users, enabling quick and
easy access to the world's largest collection of online information.

"People worldwide can find more information with Google than with any other
search engine," said Larry Page, Google co-founder and president of
Products.

Google's collection of 6 billion items comprises 4.28 billion web pages, 880
million images, 845 million Usenet messages, and a growing collection of
book-related information pages. Web surfers worldwide can now search across
Google's collection of items using the following services:

- Google Web Search: The company's flagship search service now offers 4.28
billion web pages. Google's powerful and scalable technology searches this
information and delivers a list of relevant results in an instant. Google
Web Search also enables users to search for numerous non-HTML files,
including PDF, Microsoft Office, and Corel documents.

- Google Image Search: Comprising more than 880 million images, Google Image
Search enables users to find electronic images relevant to a wide variety of
topics.
Advanced features include search by image size, format (JPEG and/or GIF),
coloration, and the ability to restrict searches to specific sites or
domains.

- Google Groups: This 20-year archive of Usenet conversations is the largest
of its kind and serves as a powerful reference tool, while offering insight
into the history and culture of the Internet. Google Groups offers more than
845 million postings in more than 35,000 topical categories.

- Google Print: A test service that enables Google users to immediately
access a range of book related information, such as first chapters, reviews,
and bibliographic information. These pages also offer users links to
directly purchase titles.

"Google Image Search has been significantly updated," said Sergey Brin,
Google co-founder and president of Technology. "We've doubled the index to
more than 880 million images, enhanced search quality, and improved the user
interface."

Today's news follows the announcement last week that Google received eight
awards in the 4th Annual Search Engine Watch Awards, which recognize
outstanding achievements in web searching. Google was recognized as the
"Outstanding Search Service," for helping internet users locate information
from across the Web. Google has received this distinction every year since
the awards were initiated in 2000. Google AdWords was also given top honors
for value, targeting, tools and overall advertiser satisfaction.

About Google Inc.
Google's innovative search technologies connect millions of people around
the world with information every day. Founded in 1998 by Stanford Ph.D.
students Larry Page and Sergey Brin, Google today is a top web property in
all major global markets. Google's targeted advertising program, which is
the largest and fastest growing in the industry, provides businesses of all
sizes with measurable results, while enhancing the overall web experience
for users. Google is headquartered in Silicon Valley with offices throughout
North America, Europe, and Asia. For more information, visit www.google.com.

# # #

Google is a trademark of Google Inc. All other company and product names may
be trademarks of the respective companies with which they are associated.



antsaint




msg:73522
 5:43 pm on Feb 17, 2004 (gmt 0)

> (My favorite quote about the web--I can't remember who said it, alas: "The Internet has answered the question regarding a million monkeys, given an infinite amount of time, eventually writing a great novel. The answer is no.")

That'd be Robert Wilensky (it's one of my faves too - I keep it on a notecard as a reminder):

We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true.

(And this was before blogs!)

It's kind of like panning a river for gold. If you have a wee stream, does it necessarily have more gold than a huge river? Perhaps, perhaps not - but how much of the gold can you catch? Now that's the question...

Every other field has the same problem. Think of all the pulp that comes out of book publishers. You might have thousands of titles of pure kindling, and only dozens that might be more than the book version of spam and nonsense. Noise always outweighs signal - but how well do you listen?

Or maybe I'm just getting too deep on not enough coffee.

GoogleGuy




msg:73523
 5:59 pm on Feb 17, 2004 (gmt 0)

Personally, my favorite is the 880M images. The bump from ~400M plus freshening the data makes it much more useful.

Powdork




msg:73524
 6:07 pm on Feb 17, 2004 (gmt 0)

> So many sites can expect their pagerank to decrease (as a result of the additional pages, with no direct influence on the ranking).
I don't think anything will be reflected in this. They didn't just push a button and voilà, now we're at 4.3 billion. They simply updated the text on their home page. Keep in mind they have been counting the pages since November '02.
I wonder if that was the project GoogleGuy was working on.
4,285,199,772
4,285,199,773
4,285,199,774
Done! Notice how he's been posting more lately.;)

<added>When I started typing the above, GG's post wasn't there. Coincidence? Not with all these black spaceships hovering outside.;)</added>

PatrickDeese




msg:73525
 6:39 pm on Feb 17, 2004 (gmt 0)

According to this search [google.com], it is more like 500,000,000. But I guess that is not only webpages, but all the files in their index (PDFs, PPT, DOC, etc).

EliteWeb




msg:73526
 7:02 pm on Feb 17, 2004 (gmt 0)

I love the image search. :D
The index has seriously grown, I wonder if Google went even deeper indexing on some major sites to pull some figures like that so quickly.

jimh009




msg:73527
 7:52 pm on Feb 17, 2004 (gmt 0)

> Personally, my favorite is the 880M images. The bump from ~400M plus freshening the data makes it much more useful.

Have to agree on that. The recent update for Google images really improved things. It is much easier to find good pictures after the image update.

Jim

SyntheticUpper




msg:73528
 8:01 pm on Feb 17, 2004 (gmt 0)

Pagerank can't decrease because of more pages - the maths sorts this out. It does lead to more competition though, but the comments about monkeys and typewriters should comfort us all a little ;)
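
One way to read "the maths sorts this out" is through the PageRank formula as originally published by Brin and Page, PR(A) = (1 - d) + d * sum(PR(T) / C(T)) over the pages T linking to A. The sketch below is a toy illustration under loud assumptions: the two link graphs are invented, every page has at least one outgoing link, and nothing here claims to reflect how Google computes PageRank in production. It shows only that the average score stays at 1 when pages are added; individual pages can still move up or down.

```python
def pagerank(links: dict[str, list[str]], d: float = 0.85,
             iterations: int = 100) -> dict[str, float]:
    """Iterate PR(A) = (1 - d) + d * sum(PR(T) / C(T)) on a toy link graph.

    Assumes every page appears as a key and has at least one outlink.
    """
    pr = {page: 1.0 for page in links}
    for _ in range(iterations):
        new = {page: 1.0 - d for page in links}
        for page, outlinks in links.items():
            share = d * pr[page] / len(outlinks)  # score passed along each link
            for target in outlinks:
                new[target] += share
        pr = new
    return pr


# Hypothetical graphs: "large" is "small" plus two extra pages.
small = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
large = {**small, "d": ["a"], "e": ["a", "d"]}

for name, graph in (("small", small), ("large", large)):
    scores = pagerank(graph)
    average = sum(scores.values()) / len(scores)
    print(name, {page: round(score, 3) for page, score in scores.items()},
          "average:", round(average, 6))
```

In this formulation the total score equals the number of pages, so a bigger index does not dilute the average; what changes is how the extra links redistribute score among individual pages.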

Powdork




msg:73529
 8:05 pm on Feb 17, 2004 (gmt 0)

> According to this search, it is more like 500,000,000. But I guess that is not only webpages, but all the files in their index (PDFs, PPT, DOC, etc).
Patrick, you just searched for all the pages that include the word site, but don't have the -igdjhdsdsz stuff.
Try this search. [google.com]

1milehgh80210




msg:73530
 9:52 pm on Feb 17, 2004 (gmt 0)

In a few years the average will be several webpages for each earth inhabitant.
Being a web-publisher means being a member of such a special & select club..

258cib




msg:73531
 9:54 pm on Feb 17, 2004 (gmt 0)

> That'd be Robert Wilensky (it's one of my faves too - I keep it on a notecard as a reminder):
> We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true.

and that was before blogs.

Thanks for the info--and the laugh.

GG: First, congrats! Now, how about an example of how the images index is better, eh? "Now you'll find a more complete and useful selection of -------."

Yidaki




msg:73532
 12:13 am on Feb 18, 2004 (gmt 0)

>Searching 4,285,199,774 web pages
>>that'll be all the spam pages that are included without any Pigeon Rank then?

A rising tide lifts all boats.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved