pontifex - 7:57 pm on May 23, 2010 (gmt 0)
Tedster: that makes perfect sense and relates to the patents you mentioned, too. Giving weight to sentences, or "snippets" of sentences, that give a pretty good idea of the user's intention would also explain the long-tail change.
It also means that pages with more sentences would get more traffic - strangely enough, this favors articles written for SEO reasons over casual content and explains the rise of spammy sites, too...
"red widgets for something"
"red widgets that you can buy and own for something"
if all other aspects of both pages are roughly the same (age, link juice, etc.) and I just "like" that theory much more than the AI approach, because it is simpler and as I know Google, they want to achieve a lot with little effort :-)
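Just to make my own guess concrete, here is a toy sketch in Python (my speculation only, nothing Google has published - the function names and scoring are mine) of how such "sentence weighting" could pick the second page: score each page by the longest chunk of the query that shows up verbatim inside a single sentence.

import re

def sentences(page_text):
    # naive sentence split, good enough for a toy example
    return re.split(r"[.!?]\s+", page_text)

def tokens(text):
    return re.findall(r"[a-z]+", text.lower())

def score(page_text, query):
    # longest contiguous chunk of the query (in words) that appears
    # verbatim inside any single sentence of the page
    q = tokens(query)
    best = 0
    for sent in sentences(page_text):
        s = tokens(sent)
        for n in range(len(q), best, -1):  # try the longest chunks first
            grams = {tuple(s[i:i + n]) for i in range(len(s) - n + 1)}
            if any(tuple(q[i:i + n]) in grams for i in range(len(q) - n + 1)):
                best = n
                break
    return best

page_a = "We sell red widgets for something."
page_b = "These are red widgets that you can buy and own for something."
query = "red widgets that you can buy"

print(score(page_a, query))  # 2 -> only "red widgets" matches
print(score(page_b, query))  # 6 -> the whole query matches, page_b wins

The real system would obviously mix in age, link juice and everything else; this only shows the core idea that the longer, more specific sentence soaks up more of the long-tail query.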
So, besides their database of "common keywords", which might have covered around 3 million terms in English, they pumped it up to 30-40 million terms and try to match those better...
As always, just wild guessing, but from my monitoring I have collected 1.8 million different phrases from live searches over time in German. More people search in English, so let's round up and say Google was pre-calculating with 3 million phrases that needed special attention because they are typed in more than twice a month.
Now (and they do that, because we know Google Trends, don't we?) they have recorded all search terms over the years and thought: well, 3 million is not good enough, let's add 2 or 3 words to all these queries we store and calculate relevance for those longer phrases in our main index.
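Again only to illustrate my guess, here is a toy sketch (seed list, log and the 3-word limit are all hypothetical numbers of mine) of how a seed list of common keywords could be pumped up from recorded queries: keep every logged query that extends a seed phrase by at most 3 extra words.

def expand_phrase_db(seed_phrases, query_log, max_extra_words=3):
    # grow a precomputed phrase set from a seed list plus a query log
    seeds = {p.lower() for p in seed_phrases}
    expanded = set(seeds)
    for query in query_log:
        q = query.lower()
        words = q.split()
        # does the query start with one of the seed phrases?
        for n in range(len(words), 0, -1):
            prefix = " ".join(words[:n])
            if prefix in seeds:
                if len(words) - n <= max_extra_words:
                    expanded.add(q)
                break
    return expanded

seeds = ["red widgets"]
log = ["red widgets for sale online",                          # 3 extra words -> kept
       "red widgets online shop",                              # 2 extra words -> kept
       "red widgets that you can buy and own for something"]   # 9 extra words -> dropped
print(sorted(expand_phrase_db(seeds, log)))
# ['red widgets', 'red widgets for sale online', 'red widgets online shop']

That is the whole trick as I imagine it: relevance gets pre-calculated for the longer phrases too, instead of only for the short head terms.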
I like the theory; it makes sense and it explains what I am seeing in the long tail! It would also enhance Google Trends, which I will take a look at now :-)