|Google Patent: Using Usage Statistics in Search|
Dean, Henzinger, Bharat et al
| 9:29 am on Aug 5, 2005 (gmt 0)|
U.S. Patent Application dated September 5, 2002 bearing the names of a few of the usual suspects, with Google, Inc. as assignee:
|Dean, Jeffrey A.; (Menlo Park, CA) ; Gomes, Benedict; (Berkeley, CA) ; Bharat, Krishna; (Santa Clara, CA) ; Harik, Georges; (Mountain View, CA) ; Henzinger, Monika H.; (Menlo Park, CA) |
Methods and apparatus for employing usage statistics in document retrieval [appft1.uspto.gov]
|Methods and apparatus consistent with the invention provide improved organization of documents responsive to a search query. In one embodiment, a search query is received and a list of responsive documents is identified. The responsive documents are organized based in whole or in part on usage statistics. |
Not to get too tin-foil-hattish, but we've all seen periodic click-tracking in the SERPs, so the data is obviously used for something. And we've been hearing of people with new sites having decent linkage appearing in the SERPs for a short time and then dropping down into oblivion, while others insist there's no sandbox, no time delay, no nothing - that it's just the optimization - because they don't experience the same thing.
| In one embodiment, a search query is received and a list of responsive documents is identified. The list of responsive documents may be based on a comparison between the search query and the contents of the documents, or by other conventional methods. Usage statistics are determined for each document, and the documents are organized based in whole or in part on the usage statistics. These usage statistics may include, for example, the number of visitors to the document (perhaps over a period of time), the frequency with which the document was visited (perhaps over a period of time), or other measures. |
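The patent text doesn't publish any formula, so as a rough illustration only, here's a hypothetical sketch of what "organized based in whole or in part on the usage statistics" might look like: a conventional relevance score blended with a normalized visit count. The field names, the blend weight, and the linear combination are all invented for this example.

```python
# Hypothetical sketch: blend a conventional relevance score with a
# usage score (visit counts over some period). Nothing here is from
# the patent itself; weight and scoring are illustrative guesses.

def rerank(responsive_docs, usage_stats, weight=0.3):
    """Re-order responsive documents using usage statistics.

    responsive_docs: list of {"url": ..., "relevance": ...}
    usage_stats: {url: visit count over some period}
    """
    max_visits = max(usage_stats.values(), default=0) or 1

    def combined(doc):
        relevance = doc["relevance"]  # conventional query/content match
        usage = usage_stats.get(doc["url"], 0) / max_visits
        return (1 - weight) * relevance + weight * usage

    return sorted(responsive_docs, key=combined, reverse=True)

docs = [{"url": "a.com", "relevance": 0.9},
        {"url": "b.com", "relevance": 0.7}]
visits = {"a.com": 10, "b.com": 5000}  # b.com gets far more traffic
print([d["url"] for d in rerank(docs, visits)])  # b.com now ranks first
```

The point of the toy numbers: b.com has the weaker text match but the heavier traffic, and the usage term is enough to flip the order, which is exactly the kind of reshuffling the claim describes.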
So first there are two factors - text or word matching for relevancy, and probably linking - and then a third factor: traffic. Is it the optimization, or could which sites show acceptable "usage statistics" play some deciding part in whether they stay put or drop into what's come to be known as the sandbox?
Tinfoil hat back on: would AdWords clicks be tallied in, for those who advertise? And possibly AdSense impression & clickthrough stats be a factor for those who get quick rankings at Yahoo and MSN and run AdSense on new sites?
| 4:39 pm on Aug 5, 2005 (gmt 0)|
Interesting. If they just took the volume of traffic, as Alexa does, it would be saying busy = best. Plus lots of sites are very profitable at 50 uniques a day and have laser-sharp relevancy.
The traffic would have to be correlated to keywords in the same way Direct Hit did years ago.
OR -- they are looking at how much traffic a site gets for which keywords, and then analyzing the group of keywords a site gets traffic for.
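Since the SERP click-tracking ties each click to the query that produced it, tallying per-keyword traffic for a site is straightforward. A minimal sketch, with an invented log format (the thread doesn't describe one):

```python
from collections import defaultdict

# Sketch of the idea above: tally SERP clicks per (site, keyword) so
# traffic can be tied to specific queries, Direct Hit-style. The
# (query, site) log tuples are a hypothetical format.

def keywords_by_site(click_log):
    """Map each site to the keywords it received clicks for, with counts."""
    profile = defaultdict(lambda: defaultdict(int))
    for query, site in click_log:
        profile[site][query] += 1
    return profile

log = [("red widgets", "widgetco.example"),
       ("blue widgets", "widgetco.example"),
       ("red widgets", "other.example")]
profile = keywords_by_site(log)
print(dict(profile["widgetco.example"]))
```

From a profile like this, the group of keywords a site draws traffic for can itself be analyzed - which is the second possibility raised above.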
| 6:48 am on Aug 7, 2005 (gmt 0)|
When they are tracking clicks in the SERPs they know exactly which keywords they're for.
| 6:23 pm on Aug 7, 2005 (gmt 0)|
|And we've been hearing of people with new sites having decent linkage appearing in the SERPs for a short time and then dropping down into oblivion |
We experienced this...
| 7:24 am on Aug 15, 2005 (gmt 0)|
Let's say you owned a search engine and set up tracking to monitor behavior, including outbound clicks.
When people searched for "red widgets", you noticed that they often clicked on results (just as users do for other queries), but these users seemed to return to the result listings page quickly (within a short span of time). The cycle repeated across the majority of the listed results for the query "red widgets". You found that these users also had a high rate of repeated searches for nearly identical terms/phrases.
Now, when users search for a very similar term (same market), "blue widgets", they return to the listing pages less often, not at all, or only to perform another (unrelated) search.
A/B testing these types of things over time would yield behavior patterns that could predict the quality of results without ever having to analyze the pages themselves.