When you search Google for any term that returns over 1,000 results, why does Google only let you see seven or eight hundred of them?
It does not matter whether the term you searched for returns 1,000 results or 10,000,000 results; Google only lets you see a few hundred. Why?
In other words, Google's software would have to go through 10M records every time you click refresh on the page that displays results 10,000,000 through 10,000,010.
That's just one "final" step that is required after the whole index has been searched and the pages have been ordered (by relevance or anything else, for that matter). And this step alone would consume all of Google's resources.
[edited by: bcc1234 at 10:45 pm (utc) on Jan. 11, 2003]
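To illustrate the cost described above, here is a minimal Python sketch. It is not Google's actual code; `page_of_results` and the integer scores are made up for the example. The point is only that serving the page at rank N means first identifying everything ranked above N.

```python
import heapq

def page_of_results(scores, offset, page_size=10):
    """Return the scores ranked offset .. offset+page_size-1, best first.

    Hypothetical helper: `scores` stands in for the relevance scores of
    all matching pages.
    """
    # To know what sits at rank `offset`, every better-ranked item has to be
    # identified first, so the top (offset + page_size) items are selected.
    top = heapq.nlargest(offset + page_size, scores)
    return top[offset:offset + page_size]

scores = list(range(1000))                 # made-up relevance scores
first_page = page_of_results(scores, 0)    # selects only the top 10
deep_page = page_of_results(scores, 500)   # selects the top 510 to show 10
```

The deeper the page, the more records have to be selected and held just to throw most of them away, which is the work that would repeat on every refresh.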
No, I am talking about normal searches. Why can't we see all the results? Why do we only get to see a very small percentage of the sites returned?
<added>Bcc1234, if what you said were accurate, how would we get any search results at all?</added>
But I assume they generate SERPs by sending the query to many parallel boxes and then combining the results.
Let's say there are 5 parallel boxes that together contain the whole index (the 3B pages); each box has 1/5 of the index.
The query goes to all 5 of them and they return the 1,000 (or any other preset number) most relevant results from THEIR indexes.
So we get 5 lists from 5 different indexes.
After that, all 5 lists are combined and the final 1k results are sorted out from those records.
Why is there a limit? Well, it's easier to allocate memory for a list with an "at most" limit on the number of records.
That way, if some of the 5 boxes' indexes did not have a single relevant page, it's just 4x1,000 or 3x1,000, etc.
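The scheme guessed at above can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how Google really does it; the names `top_from_box` and `merge_boxes`, the `LIMIT` of 1,000, and the integer scores are all assumptions for the example.

```python
import heapq
from itertools import islice

LIMIT = 1000  # assumed preset cap, used per box and for the final list

def top_from_box(box_scores, limit=LIMIT):
    """Each box returns at most `limit` of its own best scores, best first."""
    return heapq.nlargest(limit, box_scores)

def merge_boxes(per_box_lists, limit=LIMIT):
    """Combine the already sorted per-box lists and keep the best `limit`."""
    # heapq.merge needs every input sorted the same way; reverse=True means
    # largest first, which matches what nlargest returns.
    merged = heapq.merge(*per_box_lists, reverse=True)
    return list(islice(merged, limit))

# Five made-up boxes, each holding 1/5 of a pretend index of 10,000 scores.
boxes = [list(range(i, 10000, 5)) for i in range(5)]
per_box = [top_from_box(b) for b in boxes]   # at most 5 x 1,000 records
final = merge_boxes(per_box)                 # never more than 1,000
```

Because every list is capped, the merging machine never handles more than 5 x 1,000 records, no matter how many pages actually matched.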
On really specific terms it might be:
box 1 - 25 results
box 2 - 0 results
box 3 - 150 results
box 4 - 2 results
box 5 - 0 results
And the final list has 177 results.
But if the list is larger than the limit, it's truncated, with the least relevant entries being left out.
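The arithmetic for both cases, as a tiny Python sketch. The 1,000 cap is the assumed preset limit from earlier in this post, and the counts are the ones listed above.

```python
LIMIT = 1000  # assumed preset cap on the final list

# Very specific term: the boxes return 25, 0, 150, 2 and 0 results.
box_counts = [25, 0, 150, 2, 0]
final_count = min(LIMIT, sum(box_counts))        # under the cap, nothing cut

# Popular term: every box fills its own 1,000 slots.
full_counts = [1000] * 5
truncated_count = min(LIMIT, sum(full_counts))   # 5,000 candidates cut down
```

Here `final_count` is 177 and `truncated_count` is 1,000.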
I can't even imagine an efficient architecture that would allow retrieving it all. After all, you would have to store it somewhere while it's being merged and served.