Forum Moderators: Robert Charlton & goodroi
Google Hummingbird and Related Patents
The announcement of the new algorithm told us that Google actually started using Hummingbird a number of weeks ago, and that it potentially impacts around 90% of all searches.
It’s being presented as a query expansion or broadening approach which can better understand longer natural language queries, like the ones that people might speak instead of shorter keyword matching queries which someone might type into a search box.
[seobythesea.com...]
[patft.uspto.gov...]
Evaluation of Substitute Terms
The process used to find substitute terms focuses upon the use of the co-occurrence of words found on pages returned in response to a query, and to a potential substitute query. These candidate substitute terms might originally show up in documents ranking for the first query term, or in meta data associated with those documents.
For example, to find potential substitute query terms for “cats,” terms that appear in documents ranking for “cats” may be explored. One of those might be “feline.” If we perform a search for “cats”, and look through the top 10 (or top 20, or even top 100) results for words that tend to co-occur on those pages, we might see words such as “furry”, “domesticated”, “carnivorous” and “mammal” appear on a lot of the top pages returned for that query. If those are terms that tend to co-occur often in the results on a search for “cats,” they are considered co-occurring terms.
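As a rough illustration of the co-occurrence idea described above (not the patent's actual method), the sketch below counts which words show up in most of a handful of top-ranking pages for a query. The function name `cooccurring_terms`, the `min_fraction` threshold, and the sample page texts are all made up for the example.

```python
from collections import Counter
import re


def cooccurring_terms(documents, query_term, min_fraction=0.5, stopwords=frozenset()):
    """Return the terms found in at least `min_fraction` of the documents.

    Hypothetical helper: the query term itself and any stopwords are ignored.
    """
    doc_counts = Counter()
    for text in documents:
        # Count each term once per document, so long pages do not dominate.
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        tokens.discard(query_term)
        doc_counts.update(tokens - stopwords)
    threshold = min_fraction * len(documents)
    return sorted(term for term, count in doc_counts.items() if count >= threshold)


# Invented stand-ins for the text of the top results for "cats".
top_results = [
    "Cats are furry domesticated mammal companions",
    "The feline is a carnivorous mammal and cats are furry",
    "Domesticated cats: a furry carnivorous mammal",
]

# Terms that appear on every one of the sample pages.
terms = cooccurring_terms(top_results, "cats", min_fraction=1.0)
print(terms)
```

Running the same function over the results for a candidate substitute such as "feline" and comparing the two term lists would be one simple way to judge how close the two queries are, though that comparison step is an assumption here, not something the summary spells out.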
[seobythesea.com...]
[patft.uspto.gov...]
Generalized Edit Distance for Queries
This patent looks for co-occurring words within search sessions instead of on web pages or within search results for particular queries.
---
Query terms that might be similar are selected in part based on how closely they might be related semantically. For example, in a set of query sessions it’s much more likely to see “become a dentist” followed by a query for “become a dental assistant” than by “become a doctor.” It’s likely that we’ll see people change their queries in such a manner when they are performing searches in a search session.
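To make the session idea above concrete, here is a minimal sketch that simply counts which query follows which within a session. It is not the generalized edit distance from the patent, only a tally of follow-up queries; the function name `followup_counts` and the session data are invented for illustration.

```python
from collections import Counter, defaultdict


def followup_counts(sessions):
    """Count how often each query is directly followed by another in a session."""
    counts = defaultdict(Counter)
    for queries in sessions:
        # Pair every query with the one issued right after it.
        for previous, following in zip(queries, queries[1:]):
            if previous != following:  # skip repeated identical queries
                counts[previous][following] += 1
    return counts


# Invented sessions, each an ordered list of queries from one searcher.
sessions = [
    ["become a dentist", "become a dental assistant"],
    ["become a dentist", "become a dental assistant", "dental assistant salary"],
    ["become a dentist", "become a doctor"],
]

counts = followup_counts(sessions)
most_common = counts["become a dentist"].most_common(1)[0]
print(most_common)
```

With real logs, rewrites that are seen often would presumably be treated as "cheaper" edits than ones that are rarely seen, but how the patent weights them is not covered in the summary quoted here.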
[seobythesea.com...]
[appft.uspto.gov...]
Search Entity Transition Matrix and Applications of the Transition Matrix
These search entities can include:
- A query a searcher submits
- Documents responsive to the query
- The search session during which the searcher submits the query
- The time at which the query is submitted
- Advertisements presented in response to the query
- Anchor text in a link in a document
- The domain associated with a document
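One way to picture a transition matrix between entities like those listed above is a table of how often one entity leads to another, normalised so each row sums to 1. The sketch below is only a guess at the general shape, assuming we have observed (query, document) pairs; the entity labels, the `transition_matrix` helper, and the data are hypothetical and not taken from the patent.

```python
from collections import Counter, defaultdict


def transition_matrix(observations):
    """Turn (source, target) observations into row-normalised transition rates."""
    counts = defaultdict(Counter)
    for source, target in observations:
        counts[source][target] += 1
    matrix = {}
    for source, targets in counts.items():
        total = sum(targets.values())
        # Each row holds the share of transitions going to each target.
        matrix[source] = {target: n / total for target, n in targets.items()}
    return matrix


# Invented observations: which document was selected for which query.
observations = [
    ("query:cats", "doc:example.com/cats"),
    ("query:cats", "doc:example.com/cats"),
    ("query:cats", "doc:example.org/felines"),
    ("query:feline", "doc:example.org/felines"),
]

matrix = transition_matrix(observations)
for source, row in matrix.items():
    print(source, row)
```

The same structure would work for other pairs from the list, such as query to query within a session or document to domain.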
[seobythesea.com...]
[patft.uspto.gov...]
When I sit and think about all the preceding concepts and how they could work together, it seems to explain quite a bit of what appear to be inconsistencies and oddities in the results people are reporting lately.
Thanks also to Bill Slawski, who over the years has kept on top of this material, and to whom we all owe a great debt.
Hummingbird might be limited to these, but I'm thinking that it necessarily goes beyond them (and I'm guessing that you would agree with that).
Longer-term personalization and some measures of user satisfaction, building on what's come before, also play a large part in the overall algo.
[edited by: Robert_Charlton at 5:00 am (utc) on Oct 13, 2013]
[edit reason] added missing section, per poster's request [/edit]
Site owners and SEOs need to adjust their thinking to be more aligned with this to meet their clients' needs, IMO.
Fast forward 6 years or so, and Google has built up vast repositories of data on individual sites and query intent based on what users have been typing in, and how they've been responding, making the former somewhat redundant.