zerillos - 2:17 am on Dec 16, 2011 (gmt 0)
@ted. Learning is just half of the story. Please bear with me on this one (I'm an engineer, so I'm not very good with words :) ).
The key word in all this is "context". It's like the transition from structured programming to OOP. Regex (regular grammar) is also a good concept to keep in mind.
In my personal opinion, and that of a few others, Google is going for contextual search. This is the first step towards AI (understanding human language, concepts and the context in which every word is used). At this step every word is an object. Let's take "Tweet" for instance. Right now it has one obvious meaning (the social media outlet), but it can also mean the sound of a bird, or a cartoon character. The "meaning" is another object, which has several properties like timing, previous searches, hot subjects of the moment, etc.
Going further down the rabbit hole, previous searches can be categorized too. To go even further, we can add user behavior, ratio of ads to content, website age, the website's general vocabulary, level of speech, quality of images, hosting service quality, errors, well-formed code, HTTP errors, standards compliance and every single angle covered here since Feb 2011 (and I suspect a few others too). All these can be added as properties to define a website.
Let's say you have defined one website in terms of all these (and other) aspects. That means you can define every website the same way. Going further down the hole, you can define whole verticals, market segments, etc. Theoretically, this concept can be applied at every single level, from Layer 1 to Layer 7.
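To make the "every word is an object" idea a bit more concrete, here is a rough Python sketch. Everything in it is made up for illustration: the class names, the `trending` number, the `related_terms` sets and the scoring rule are my assumptions, not anything Google has published or is known to use.

```python
from dataclasses import dataclass, field


@dataclass
class Meaning:
    """One possible sense of a word, with hypothetical context properties."""
    label: str
    trending: float = 0.0  # assumed 0..1 score for "hot subjects of the moment"
    related_terms: set = field(default_factory=set)  # terms that hint at this sense


@dataclass
class Word:
    """A word is an object that owns several candidate meanings."""
    text: str
    meanings: list = field(default_factory=list)

    def resolve(self, previous_searches):
        """Return the meaning that best fits the user's previous searches."""
        def score(meaning):
            # Count terms from earlier queries that point at this meaning,
            # then add the trending score so it acts as a tie-breaker.
            overlap = sum(
                1
                for query in previous_searches
                for term in query.lower().split()
                if term in meaning.related_terms
            )
            return overlap + meaning.trending

        return max(self.meanings, key=score)


# Hypothetical data for the "Tweet" example.
tweet = Word("tweet", [
    Meaning("social media post", trending=0.9,
            related_terms={"twitter", "retweet", "hashtag"}),
    Meaning("bird sound", trending=0.1,
            related_terms={"bird", "birdsong", "sparrow"}),
    Meaning("cartoon character", trending=0.2,
            related_terms={"tweety", "sylvester", "cartoon"}),
])

# With no history, the trending sense wins; with bird-related history, context wins.
print(tweet.resolve([]).label)
print(tweet.resolve(["sparrow bird feeder", "birdsong recordings"]).label)
```

The same pattern would extend upwards: a website becomes an object whose properties are the signals listed above, and a vertical becomes a collection of such website objects.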
If I wanted to go after spam and all the crap filling the web today, this would be the first choice in my mind. It comes down to volume, and to taking everything into consideration in relation to everything else.
Do I make any sense at all?