Forum Moderators: Robert Charlton & goodroi
40. A method, comprising: aggregating information regarding documents that have been removed by a group of users; and assigning scores to a set of documents based on the aggregated information.
41. The method of claim 40, wherein aggregating information regarding documents that have been removed by a group of users includes: identifying a set of legitimate users and a set of illegitimate users; and collecting information regarding documents that have been removed by the set of legitimate users.
42. The method of claim 40, wherein aggregating information regarding documents that have been removed by a group of users includes: identifying a set of users with a defined relationship; and collecting information regarding documents that have been removed by the set of users.
Removing documents [appft1.uspto.gov]
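To make the quoted claims concrete, here is a minimal sketch of what claims 40 and 41 describe: aggregate removal events, filter them to an identified set of legitimate users, and score documents from the aggregate. Everything here (function names, the linear penalty, the session format) is an illustrative assumption, not anything stated in the patent text.

```python
# Hypothetical sketch of claims 40-41. The scoring formula and all
# identifiers are assumptions for illustration only.
from collections import Counter

def score_documents(removals, legitimate_users, documents,
                    base_score=1.0, penalty=0.1):
    """removals: list of (user, doc) pairs recording removal events."""
    # Claim 41: keep only removals made by identified legitimate users.
    counts = Counter(doc for user, doc in removals if user in legitimate_users)
    # Claim 40: assign each document a score based on the aggregated info.
    return {doc: base_score - penalty * counts[doc] for doc in documents}
```

For example, if "alice" and "bob" are legitimate and both removed document d1, d1's score drops while a document removed only by an illegitimate user keeps its base score.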
Obviously, but what you forget is that Google is almost certainly comparing a site to its peers, so in this context pages on different sites that rank well for 'train schedules' will likely have similar visit times.
Also useful for rooting out thin affiliate sites, where a high % of visitors to site A end up (quickly or not) on site B.
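That thin-affiliate signal could be sketched roughly as follows: measure what fraction of visitors to site A go on to reach site B in the same session. The session representation and any threshold you'd apply are my assumptions, not anything Google has documented.

```python
# Hypothetical sketch of the thin-affiliate signal discussed above.
# sessions: one list of visited sites per visitor, in order.
def affiliate_ratio(sessions, site_a, site_b):
    """Fraction of site_a's visitors who later reach site_b."""
    visits = [s for s in sessions if site_a in s]
    if not visits:
        return 0.0
    # Count sessions where site_b appears after the first visit to site_a.
    forwarded = sum(1 for s in visits if site_b in s[s.index(site_a) + 1:])
    return forwarded / len(visits)
```

A site with a ratio near 1.0 would look like a pure pass-through to site B; a normal site would sit much lower. Where the cutoff lies is exactly the kind of tuning only Google could do.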
There are an awful lot of possibilities that we, and webmasters on other forums, see for how this information could be examined to track down certain things: user behaviour, site behaviour, spammers and so on.
But lately I've come to think that if the search engine and the advertising network require an infrastructure this big, and performance still falls short...
Things like Google tracking webmaster IPs in Analytics, Webmaster Tools, AdWords and AdSense; things like Google using the toolbar data to its full potential; things like Google setting up a predicted pattern for every query...
All of it would be possible, but only one piece at a time, and only in theory.
They'd need two to three times the servers they have right now to do all the things we assume they are doing.
The totalitarian ruler of search is but a utopia.
I say Google struggles on an everyday level as it is.
Not financially, but with its infrastructure, and how it can deal with the amount of data it collects.