It looks like Chndru and I were wrong [webmasterworld.com]. I have to wonder though, how many of those have been crawled? Google has a lot of 'URL-only' listings.
Would be nice to see that number going down IMO. Much too much mess out there to search through.
Most of my sites are getting lost in the thousands of pages of auto-generated nonsense :(
However, well done G on having the power ;).
Exactly. I see the grist for dozens, if not hundreds, of media and tech columns in this new number. Not that the old number was any less daunting, but the change begs the question--what are we searching for? Is more better? Or is better better?
So, OK, media, tech, religion and philosophy columns--expect to hear about this from the pulpit in coming Sundays. (My favorite quote about the web--I can't remember who said it, alas: "The Internet has answered the question regarding a million monkeys, given an infinite amount of time, eventually writing a great novel. The answer is no.")
Google Inc. today announced it expanded the breadth of its web index to more than 6 billion items. This innovation represents a milestone for Internet users, enabling quick and easy access to the world's largest collection of online information.
yup, ciml. maybe Yahoo's drop of G has triggered the change.
anyways, i always wondered what these 4.28+ billion documents represent.
All the spidered web pages? Does it include pages with duplicate content? Different session IDs of the same page? The www and non-www versions of a page? Translated pages? Graphical and non-graphical versions of a page?
Given all these, would the meaningful content be boiled down to around 1/5th of the index?
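To make the duplicate question a bit more concrete, here is a rough sketch of how www/non-www and session-ID variants of one page could be collapsed into a single key before counting. Nothing here reflects how Google actually counts its index: the function name `canonical_url`, the `SESSION_PARAMS` list, and the example.com URLs are all assumptions made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of session-ID parameter names; real sites vary widely.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid", "jsessionid"}

def canonical_url(url: str) -> str:
    """Collapse www/non-www and session-ID variants into one key (a sketch)."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Assumption: www.example.com and example.com serve the same content.
    if host.startswith("www."):
        host = host[4:]
    if parts.port:
        host = f"{host}:{parts.port}"
    # Drop query parameters that look like session IDs, keep the rest in order.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    path = parts.path or "/"
    # The fragment is discarded because it never reaches the server.
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))

# Made-up example URLs: three spellings of the same page plus one other page.
urls = [
    "http://www.example.com/page?id=5",
    "http://example.com/page?id=5&PHPSESSID=abc123",
    "http://example.com/page?sid=zzz&id=5",
    "http://example.com/other",
]
unique_pages = {canonical_url(u) for u in urls}
```

In this toy list, four crawled URLs collapse to two distinct pages. How much of a real index would shrink this way is anyone's guess; the sketch only shows the mechanics.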
GOOGLE ACHIEVES SEARCH MILESTONE WITH IMMEDIATE ACCESS TO MORE THAN 6 BILLION ITEMS
Google Connects Searchers to World's Most Comprehensive Index; Increases Web Page and Image Collections
MOUNTAIN VIEW, Calif. - Feb. 17, 2004 - Google Inc. today announced it expanded the breadth of its web index to more than 6 billion items. This innovation represents a milestone for Internet users, enabling quick and easy access to the world's largest collection of online information.
"People worldwide can find more information with Google than with any other search engine," said Larry Page, Google co-founder and president of Products.
Google's collection of 6 billion items comprises 4.28 billion web pages, 880 million images, 845 million Usenet messages, and a growing collection of book-related information pages. Web surfers worldwide can now search across Google's collection of items using the following services:
- Google Web Search: The company's flagship search service now offers 4.28 billion web pages. Google's powerful and scalable technology searches this information and delivers a list of relevant results in an instant. Google Web Search also enables users to search for numerous non-HTML files, including PDF, Microsoft Office, and Corel documents.
- Google Image Search: Comprising more than 880 million images, Google Image Search enables users to find electronic images relevant to a wide variety of topics. Advanced features include search by image size, format (JPEG and/or GIF), coloration, and the ability to restrict searches to specific sites or domains.
- Google Groups: This 20-year archive of Usenet conversations is the largest of its kind and serves as a powerful reference tool, while offering insight into the history and culture of the Internet. Google Groups offers more than 845 million postings in more than 35,000 topical categories.
- Google Print: A test service that enables Google users to immediately access a range of book related information, such as first chapters, reviews, and bibliographic information. These pages also offer users links to directly purchase titles.
"Google Image Search has been significantly updated," said Sergey Brin, Google co-founder and president of Technology. "We've doubled the index to more than 880 million images, enhanced search quality, and improved the user interface."
Today's news follows the announcement last week that Google received eight awards in the 4th Annual Search Engine Watch Awards, which recognize outstanding achievements in web searching. Google was recognized as the "Outstanding Search Service," for helping internet users locate information from across the Web. Google has received this distinction every year since the awards were initiated in 2000. Google AdWords was also given top honors for value, targeting, tools and overall advertiser satisfaction.
About Google Inc.
Google's innovative search technologies connect millions of people around the world with information every day. Founded in 1998 by Stanford Ph.D. students Larry Page and Sergey Brin, Google today is a top web property in all major global markets. Google's targeted advertising program, which is the largest and fastest growing in the industry, provides businesses of all sizes with measurable results, while enhancing the overall web experience for users. Google is headquartered in Silicon Valley with offices throughout North America, Europe, and Asia. For more information, visit www.google.com.
# # #
Google is a trademark of Google Inc. All other company and product names may be trademarks of the respective companies with which they are associated.
(My favorite quote about the web--I can't remember who said it, alas: "The Internet has answered the question regarding a million monkeys, given an infinite amount of time, eventually writing a great novel. The answer is no.")
That'd be Robert Wilensky (it's one of my faves too - I keep it on a notecard as a reminder):
We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true.
(And this was before blogs!)
It's kind of like panning a river for gold. If you have a wee stream, does it necessarily have more gold than a huge river? Perhaps, perhaps not - but how much of the gold can you catch? Now that's the question...
Every other field has the same problem. Think of all the pulp that comes out of book publishers. You might have thousands of titles of pure kindling, and only dozens that might be more than the book version of spam and nonsense. Noise always outweighs signal - but how well do you listen?
Or maybe I'm just getting too deep on not enough coffee.
So many sites can expect their pagerank to decrease (as a result of the additional pages, with no direct influence on the ranking).
I don't think anything will be reflected in this. They didn't just push a button and voilà, now we're at 4.3 billion. They simply updated the text on their home page. Keep in mind they have been counting the pages since November '02.
<added>When I started typing the above, GG's post wasn't there. Coincidence? Not with all these black spaceships hovering outside.;)</added>
According to this search, it is more like 500,000,000. But I guess that is not only webpages, but all the files in their index (PDFs, PPT, DOC, etc.).
Patrick, you just searched for all the pages that include the word site, but don't have the -igdjhdsdsz stuff.
That'd be Robert Wilensky (it's one of my faves too - I keep it on a notecard as a reminder):
We've all heard that a million monkeys banging on a million typewriters will eventually reproduce the entire works of Shakespeare. Now, thanks to the Internet, we know this is not true.
and that was before blogs.
Thanks for the info--and the laugh.
GG: First, congrats! Now, how about an example of how the images index is better, eh? "Now you'll find a more complete and useful selection of -------."