Forum Moderators: Robert Charlton & goodroi
Yes, I see this kind of change, too. However the search for "SEO" is returning in the 900's for me right now. Still, it's not showing all 1,000. I'm glad you posted about this, because my sense is that this is a visible sign of a significant change on Google's back end, and not just an attempt to limit our access to their data.
Two questions I wonder about:
1. What criteria are used to choose the size limit for that initial result set?
2. Is this tied to some of the "re-ranking" we see, such as in the "-950 penalty"?
For #1, my working idea is that the preliminary result set a query generates gets truncated at a point determined by the "relevance scores" of the URLs returned initially. Once those scores get too low, Google won't include the URL, even if it does, strictly speaking, contain the search terms.
For #2, (my thinking here is based on a number of patents) it's possible that Google is using more types of re-ranking over the preliminary result set. By truncating the size of the set, they've lowered computational overhead.
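To make the two guesses above concrete, here is a minimal sketch. Everything in it is hypothetical: the Candidate type, the score units, the min_score cutoff and the rerank step are illustration only, not anything Google has documented. It just shows why cutting the preliminary set first leaves fewer URLs for any later re-ranking pass to touch.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    url: str
    relevance: float  # hypothetical preliminary score, arbitrary units


def truncate_by_score(candidates, min_score, hard_cap=1000):
    """Keep candidates scoring at least min_score, best first, capped at hard_cap."""
    ranked = sorted(candidates, key=lambda c: c.relevance, reverse=True)
    kept = [c for c in ranked if c.relevance >= min_score]
    return kept[:hard_cap]


def rerank(candidates, adjust):
    """Apply a (hypothetical) re-ranking adjustment to the truncated set only."""
    return sorted(candidates, key=lambda c: c.relevance + adjust(c), reverse=True)


# Toy data: 2,000 pages that all contain the search terms, with falling scores.
pool = [Candidate(f"page{i}.example", 2000 - i) for i in range(2000)]

# Assumed cutoff: anything scoring below 1100 is dropped before re-ranking.
shortlist = truncate_by_score(pool, min_score=1100)

# The re-ranking pass now only has to process the shortlist, not the whole pool.
final = rerank(shortlist, adjust=lambda c: 0.0)
print(len(pool), len(shortlist), len(final))
```

The point is only the order of operations: whatever the re-ranking costs per URL, it is paid on the shortlist rather than on every matching page.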
I did a search for news
1.1 billion results.
On page 6, I always get the lovely "In order to show you the most relevant results, we have omitted some entries very similar to the 530 already displayed.
If you like, you can repeat the search with the omitted results included."
Click on that, and I get pages 1-5.
5th page gives me
Personalized Results 401 - 458 of 458 for news [definition]. (0.17 seconds)
However, Google does not exist to supply lots of data to the likes of you and me. They want to supply information to ordinary end-users, people searching for information. Those folks rarely care about page 19 of the results, so what Google is doing here does not impact them. If Google makes changes that allow them to answer more searches in less time, the end-user will like that.
[edited by: tedster at 4:59 am (utc) on Oct. 18, 2007]
Querying an IP directly often gives different results than letting google.com direct you through their load balancing and so forth. In this case, I get 608 initial results (3 more) and the same 742 when I include the omitted results.
No, I don't think this is directly related to the 950 re-ranking being "turned up". Instead, my guess is that a shorter preliminary list makes such re-ranking computationally easier. People with the -950 still see their previous first page results at the end, but that end just comes quicker.
[edited by: tedster at 5:11 am (utc) on Nov. 1, 2007]
They should cut it down to 100, how many users ever go beyond that?
I think Google checked its data for the typical cut-off points by searchers and "cut the fat."
The deepest search I can ever recall seeing in raw logs was about 720. However, I rarely see anything past 30. I'm sure a lot of searchers don't even get to page 2 of results, but I'd like Google to share its data.
p/g
Using FF it's been fine, but I'm seeing a *very* weird result for one search term (pure, unfiltered garbage), only in IE. It seems to be tied to the data center I'm consistently accessing with IE, whether or not I'm logged in.
Here's the breakdown:
Search term leading to my site is on...
Page 1: 85.0%
Page 2: 7.1%
Page 3: 3.0%
Page 4: 1.4%
Page 5: 0.8%
Page 6: 0.6%
Page 7: 0.4%
Page 8: 0.2%
Page 9: 0.2%
Page 10: 0.2%
beyond Page 10: 1.1%
My stats also show that just about 8%-9% of the searches come from Page 3 or beyond. The trend has been slightly decreasing over the past months. This is perfectly normal behaviour IMO. When using Google I think the SERPs on page 3 or beyond are of lower quality than the first two pages, so I just do not think it's worth trying a result on these pages.
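For anyone who wants to check the arithmetic on a breakdown like this, here is a quick sketch that adds up the shares from the list above. The variable names are mine; the numbers are copied straight from the breakdown. With the figures as listed, page 3 and beyond comes out a touch under the 8%-9% range.

```python
# (page label, percent of referring searches), copied from the breakdown above
page_share = [
    ("1", 85.0), ("2", 7.1), ("3", 3.0), ("4", 1.4), ("5", 0.8),
    ("6", 0.6), ("7", 0.4), ("8", 0.2), ("9", 0.2), ("10", 0.2),
    ("beyond 10", 1.1),
]

total = round(sum(pct for _, pct in page_share), 1)
first_two = round(sum(pct for label, pct in page_share if label in ("1", "2")), 1)
page3_and_beyond = round(total - first_two, 1)

print(f"total: {total}%")
print(f"pages 1-2: {first_two}%")
print(f"page 3 and beyond: {page3_and_beyond}%")
```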
They should cut it down to 100, how many users ever go beyond that?
Is anyone else seeing this and what does it mean?
At the moment, there do not seem to be any significant changes in SERP results - at least not on pages 1 and 2.....
[edited by: tedster at 5:04 pm (utc) on Nov. 7, 2007]
[edit reason] moved from another location [/edit]
So we tried this with a long term, which we eventually shortened down to just the single word computer. That gives 872 results without the omitted results, which is obviously not an accurate figure if you're considering SEO difficulty. I didn't have an explanation for the guy as to why it was so few and he probably thought I was an idiot lol. Good to see a thread about this.
No more 600-999, did I miss something?
It's been like this for years on generic, ultra competitive searches.
I don't get it,... did *I* miss something?
1. What criteria are used to choose the size limit for that initial result set?
TrustRank.
OK, that's the method. Criteria are... well... SPAM and aggressive SEO activity. What else. *grin*
Anything that can manipulate relevancy and PageRank but not TrustRank. That's what it's for. See below.
2. Is this tied to some of the "re-ranking" we see, such as in the "-950 penalty"?
It has no connection with the -950 penalty.
The rest... I don't think so... in what way could it be?
It's partially a manual setting though so...
...returning in the 900's for me right now
This threshold is virtually a runtime setting.
It can change several times a day, or be left alone for weeks.
In one area I watch, the generic travel sector, an ultra competitive phrase will show any number of results between 97 (!) and 900-something.
...
The number of results displayed is based on the trust threshold set for a given search. The initial - relevancy + trust + whatnot - set is always much larger of course.
This will be first filtered for sites that don't clear the required parameter ( trust ), and only *then* comes the application of some rerankings.
Thus the list will only show results from *domains* and within them, *URLs* that are trusted.
Sometimes compromising some more relevant sites that'd have ended up at good positions otherwise...
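A minimal sketch of that order of operations, as I read it from the description above. The Result fields, the scores and the trust_threshold value are all made up for illustration; nobody outside Google knows the real signals or numbers. It shows how a highly relevant page can vanish from the list simply because it fails the trust check before any ranking happens.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float  # hypothetical relevance score
    trust: float      # hypothetical trust score


def filter_then_rank(results, trust_threshold):
    """Drop results below the trust threshold first, then order by relevance."""
    trusted = [r for r in results if r.trust >= trust_threshold]
    return sorted(trusted, key=lambda r: r.relevance, reverse=True)


results = [
    Result("very-relevant-but-new.example", relevance=0.9, trust=0.2),
    Result("established-site.example", relevance=0.7, trust=0.8),
    Result("old-authority.example", relevance=0.5, trust=0.9),
]

# The most relevant page never reaches the ranking step.
visible = filter_then_rank(results, trust_threshold=0.5)
print([r.url for r in visible])
```

Raising or lowering trust_threshold changes how many results are shown at all, which would fit the count swinging between 97 and 900-something for the same phrase.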
And to newbies reading this, guess what, this is what we've been calling the 'sandbox effect'.
Trust increases as links/sites age, thus sometimes sites with no *new* links suddenly clear the threshold. But it's not a time limit.
The final list will show results from only the most trusted domains... Apply the &filter=0 parameter and you have effectively done a site: search on them, but only them.
The top list is always clipped at a point that feels convenient, more often than not because after that position, there's some irregular SPAM activity ( or too many people reaching the 'pro-level' in any other 'unfair' way - Google's interpretation of fair, not mine. ) It's effective all right. A little too effective in some cases, but most of the time I'm thankful it's there.
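If you want to compare the two counts side by side, a small helper for building both URLs might look like this. The function name and the include_omitted flag are my own; q and filter=0 are the query-string parameters as discussed in this thread, and I'm assuming the usual google.com/search path. Whether Google keeps honoring filter=0 the same way is not something this sketch can promise.

```python
from urllib.parse import urlencode


def search_url(query, include_omitted=False):
    """Build a search URL, optionally adding filter=0 to include omitted results."""
    params = {"q": query}
    if include_omitted:
        params["filter"] = "0"
    return "https://www.google.com/search?" + urlencode(params)


print(search_url("news"))
print(search_url("news", include_omitted=True))
```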
Rerankings like the -950 are applied in a very interesting way. A site that would be #1, but is filtered to be -950 will show up #355 on SERPs that end at position 355. You know, that's because sites that go -950 ( for popular 1,2 word phrases ) are in fact, trusted. As I've been saying for almost a year. But whatever.
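Here is a toy version of the #1 to #355 example above, assuming the re-ranking simply moves a flagged site to the end of whatever list survived the clipping. The function name and the site labels are invented for the sketch; it says nothing about how Google actually decides which sites get flagged.

```python
def apply_end_of_list_demotion(ranked_urls, demoted):
    """Move demoted URLs to the end of the list, keeping the order of the rest."""
    kept = [u for u in ranked_urls if u not in demoted]
    pushed = [u for u in ranked_urls if u in demoted]
    return kept + pushed


# A SERP that has already been clipped to 355 results, best first.
serp = [f"site{i}.example" for i in range(1, 356)]

# The site that would have been #1 is flagged for the re-ranking.
reranked = apply_end_of_list_demotion(serp, demoted={"site1.example"})

position = reranked.index("site1.example") + 1
print(position, len(reranked))
```

Note that the demoted site is still in the list. It had to clear the clipping step to be there at all, which is the point being made about these sites being trusted.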
...
Or was this all off topic?
[edited by: Miamacs at 2:14 pm (utc) on Nov. 8, 2007]
I assume these are errors in the estimates. This effect has happened before, several times, over the last few years.
I think that at least one of the significant changes in the reported counts, a year or two back, coincided with a major update to Supplemental Results.