P.S. The total number of results returned has steadily increased over the past few months. I feel this is in anticipation of the IPO. One of those "our index is bigger than theirs" type things.
<snip>
[edited by: WebGuerrilla at 1:25 am (utc) on Dec. 4, 2003]
[edit reason] Sorry, no tools [/edit]
Scoring and ordering more results wouldn't increase time linearly; it grows much faster than that. If ordering 1,000 results takes 10 ms, ordering 10,000 would take 220 ms. Working with more realistic numbers: a search may take 0.20 seconds and order 1,000 results. If it had to order 170k results, it would take almost two hours. Ordering 2 million results would take 25 years of continuous work...
For this reason, G returns enough results without being overwhelming. Nobody who searches wants to wait years to get their results...
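Those figures look like a roughly quadratic extrapolation from the 0.2 s per 1,000-results baseline. Here is a minimal back-of-envelope sketch of that extrapolation (the quadratic model is my assumption; it roughly reproduces the "almost two hours" figure, though not every number quoted in the thread):

[code]
# Back-of-envelope extrapolation, assuming (as a guess, not anything known
# about Google's internals) that ranking time grows quadratically with the
# number of results, anchored at 0.2 s for 1,000 results.

BASELINE_RESULTS = 1_000
BASELINE_SECONDS = 0.2

def estimated_ranking_time(n_results: int) -> float:
    """Extrapolated ranking time in seconds under the quadratic assumption."""
    scale = n_results / BASELINE_RESULTS
    return BASELINE_SECONDS * scale ** 2

for n in (1_000, 170_000, 2_000_000):
    secs = estimated_ranking_time(n)
    print(f"{n:>9,} results -> {secs:,.1f} s (~{secs / 3600:,.1f} hours)")
[/code]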
Greetings,
Herenvardö
The problem with this is that it is a recursive algorithm, and it needs a huge amount of memory to work. For 170k results, it would need 27 GB of memory. For the 2 million results, it would need 4 terabytes (4,000 GB)! For a thousand results, it only takes 0.2 s and uses 1 MB of memory.
Does anybody want to know how much memory G would need if it had to rank every page in its index for a single search? ;)
The task would need more than 9 exabytes (1 exabyte = 1024^3 GB, roughly a billion gigabytes) and it would take 66 years.
Using the slow algorithm, it would need less memory (only 3 GB), but it would take more than 300 billion years!
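Those memory figures look like the same quadratic extrapolation, this time from the 1 MB per 1,000-results baseline. A rough sketch under that assumption (mine, not anything confirmed about the algorithm above) lands close to the quoted values:

[code]
# Rough reconstruction of the memory figures above, assuming memory grows
# quadratically from a 1 MB / 1,000-results baseline (my assumption).

BASELINE_RESULTS = 1_000
BASELINE_MB = 1.0

def estimated_memory_mb(n_results: int) -> float:
    """Extrapolated memory use in MB under the quadratic assumption."""
    return BASELINE_MB * (n_results / BASELINE_RESULTS) ** 2

def human_readable(mb: float) -> str:
    """Convert megabytes into the largest sensible binary unit."""
    units = ["MB", "GB", "TB", "PB", "EB"]
    i = 0
    while mb >= 1024 and i < len(units) - 1:
        mb /= 1024
        i += 1
    return f"{mb:,.1f} {units[i]}"

for n in (1_000, 170_000, 2_000_000, 3_000_000_000):
    print(f"{n:>13,} results -> ~{human_readable(estimated_memory_mb(n))}")
[/code]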
Greetings,
Herenvardö
[edit]Note for non-English speakers: a billion in English means a thousand million. In some other languages it means a million million. I used the term in English, so take the English meaning ;)[/edit]
Don't forget that Google openly admits it saves the queries we have previously made, courtesy of the cookie stored on our hard drive. Each time we do a search, it can check that table first to return the top-listed sites more quickly. It could be a "cached" page as well. You never know.
If you query "mailto:info@" at G, it says there are over 400K results. Spammers would have a field day using a simple Perl module designed to harvest email addresses. You'll find these limits at other engines as well.
1) There is no reason for Google to offer its entire database to every user who wants it.
2) Search results are based on a given input. Google ranks them as best it can and gives you the best listings. The rest of the listings don't count for that search, or don't count enough to matter as far as Google is concerned.
3) Google claims 3 billion indexed pages, and it does indeed have that many to search through. When you do a search, it isn't searching through only 1,000 listings; it's searching through pretty much all of them in order to get your 1,000. There is no misleading claim; in fact, pretty much all the search engines have a hard limit on the number of results you can actually see.
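To illustrate that last point, here is a minimal sketch of bounded top-k selection: you can scan a score for every matching document yet only ever keep the best 1,000 in memory, which is why "searches everything" and "shows at most 1,000" are not in conflict. The scores and counts below are made-up placeholders, not anything resembling Google's actual ranking:

[code]
import heapq
import random

TOP_K = 1_000
TOTAL_DOCS = 1_000_000  # stand-in for millions of matching pages

def top_results(scored_docs, k=TOP_K):
    """Return the k highest-scoring (score, doc_id) pairs from a stream."""
    # heapq.nlargest keeps only k items in memory while scanning everything.
    return heapq.nlargest(k, scored_docs)

# Placeholder scores; a real engine would compute relevance per document.
docs = ((random.random(), doc_id) for doc_id in range(TOTAL_DOCS))
best = top_results(docs)
print(f"Scanned {TOTAL_DOCS:,} docs, kept {len(best):,}; top score {best[0][0]:.4f}")
[/code]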