
Forum Moderators: Robert Charlton & goodroi


Google's Query Expansion Capabilities: Observations

8:12 pm on Oct 18, 2008 (gmt 0)

Senior Member

joined:Jan 27, 2003
votes: 0

What you search is not always what you get

I would consider myself a fairly precise searcher - I know exactly what I'm typing. Nonetheless, Google often searches for what it thinks I really meant, as opposed to what I actually entered - a process sometimes known as query expansion.

Common examples of query expansion:

Word stemming

Google introduced word stemming at least five years ago: [widgets] matches [widgets], [widgeting] and [widgeteering]. The keyword entered is reduced to a root or 'stem' ('widget' in the examples above), and words built on the same stem can be matched.
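
To make the stemming idea concrete, here is a minimal sketch of suffix stripping. It is not Google's algorithm (which isn't public); the suffix list and the names stem and matches are made up purely for illustration, using the widget examples above.

```python
# Toy suffix-stripping stemmer. NOT Google's method, just an illustration.
# The suffix list is a hypothetical one, ordered longest first.
SUFFIXES = ("eering", "ing", "s")


def stem(word: str) -> str:
    """Reduce a word to a crude stem by stripping one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so very short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query_word: str, page_word: str) -> bool:
    """Two words match when they reduce to the same stem."""
    return stem(query_word) == stem(page_word)


if __name__ == "__main__":
    for candidate in ("widgets", "widgeting", "widgeteering", "gadgets"):
        print(candidate, matches("widgets", candidate))
```

A real stemmer needs far more rules and exceptions than this, but the principle is the same: compare stems rather than the literal strings.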


Acronyms

An [FAQ] is a set of [frequently asked questions]. Google has impressive mappings of acronyms and initialisms to the full phrase. I think this would be an interesting database if it were ever made available.
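
As a rough sketch of how an acronym mapping could feed query expansion: the table below is hypothetical (Google's mappings are not public) and expand_query is an illustrative name, not a real API.

```python
# Hypothetical acronym table; the real mappings are not publicly available.
ACRONYMS = {
    "faq": "frequently asked questions",
}


def expand_query(query: str) -> list:
    """Return the original query plus variants with acronyms spelled out."""
    variants = [query]
    words = query.lower().split()
    for i, word in enumerate(words):
        if word in ACRONYMS:
            # Swap the acronym for its full phrase, keeping the other words.
            expanded = words[:i] + [ACRONYMS[word]] + words[i + 1:]
            variants.append(" ".join(expanded))
    return variants


if __name__ == "__main__":
    print(expand_query("widget FAQ"))
```

The search would then be run against every variant, so a page that only says "frequently asked questions" can still match [FAQ].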

Mis-spellings and typos

If you make an obvious typo, Google can include sites that only use the correctly-spelled word (in addition to the "did you mean:" prompt). This is much more obvious with certain queries.
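
One simple way to picture typo handling is fuzzy matching against a list of known words. The sketch below uses Python's standard difflib module; the vocabulary and the 0.8 cutoff are assumptions for illustration only, and a search engine would presumably draw on far richer data (such as query logs).

```python
from difflib import get_close_matches

# Hypothetical vocabulary of correctly spelled words seen in the index.
VOCABULARY = ["widget", "gadget", "search", "query"]


def correct(word: str, cutoff: float = 0.8) -> str:
    """Return the closest known word, or the input unchanged if none is close."""
    candidates = get_close_matches(word.lower(), VOCABULARY, n=1, cutoff=cutoff)
    return candidates[0] if candidates else word


def expand_with_corrections(query: str) -> list:
    """Return the query as typed, plus a corrected variant when one differs."""
    corrected = " ".join(correct(w) for w in query.split())
    return [query] if corrected == query else [query, corrected]


if __name__ == "__main__":
    print(expand_with_corrections("widgit"))
```

Keeping the original query alongside the corrected one mirrors the behaviour described above: correctly spelled pages are included in addition to, not instead of, what was typed.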

Less common examples of query expansion:


Synonyms

Synonyms may be too narrow a definition, since the search operator for synonyms (~) reveals words that seem to have been derived from co-occurrence data, and have very distinct meanings. Nonetheless, it seems to be possible for Google to expand your query to include related words.


Translation

I see reflections of this in the interesting search result translation [translate.google.com] service. In some instances, Google seems to translate search keywords into other languages and return results from that language. I haven't really pinned down the pattern as to which queries (and pages) get this treatment, but I've seen quite a few examples where non-English keywords match English pages.

Ignored words

I occasionally see searches where words appear to have been dropped completely from the query. It's possible that certain keywords are deemed to lack significance, so that Google returns results with those words omitted from the search. I've only seen a few examples that point directly to this behaviour.

Interestingly, not all content in the index gets the query-expansion treatment. I've seen results that suggest a more wide-reaching characteristic of URLs likely to get fuzzy-matching, but that's probably for another day ;)

In most of the common cases, it seems clear that rewriting of the search query can occur, even if the expanded words are not:

  • within the on-page content
  • within links to the URL
  • or even in text adjacent to links to the URL

Of course, in many cases one of the above conditions is true, which can make finding true examples of query expansion much more difficult. In addition, many of the processes involved seem to be based on aggregated data from content within the index, or based on user search behaviour - which means that there is more or less useful data available depending on the popularity and frequency of occurrence of the search keyword.

Whether query expansion occurs also seems to be related to the entire search query - certain formulations are much more likely to trigger expansion than others. Possibly this has both linguistic (e.g. not expanding a word that is used as part of a common phrase) and statistical (e.g. based on user behaviour) aspects.

Does anyone know any other examples of Google's query expansion capabilities, or have any other observations?

Note to other power searchers - prefix each search keyword with a plus symbol to bypass most query expansion processes.
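
If you build such queries often, the plus-prefix tip can be scripted. This is only a small convenience sketch: force_exact is a made-up helper name, and it assumes the operator behaviour described in this thread.

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def force_exact(query: str) -> str:
    """Prefix each bare keyword with '+', leaving operators and phrases alone."""
    result = []
    for token in TOKEN.findall(query):
        if token.startswith(("+", "-", '"')):
            # Already an operator or a quoted phrase: keep it unchanged.
            result.append(token)
        else:
            result.append("+" + token)
    return " ".join(result)


if __name__ == "__main__":
    print(force_exact('widget bug "blue screen" -forum'))
```

Quoted phrases and terms already carrying a + or - are deliberately left untouched, so the helper only changes keywords that would otherwise be open to expansion.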

9:02 pm on Oct 18, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
votes: 0

Ignored words

That happens to me particularly in longer technical searches. I often need to remember the + sign to get the word "bug" or "problem" included in the result set. Without the +, that key term can sometimes be ignored even when I make it the first word.

4:51 pm on Nov 5, 2008 (gmt 0)

Senior Member

joined:Jan 27, 2003
votes: 0

Just as a footnote to this (as I seem to get asked a fair bit): query expansion is also one of the most common reasons for the message "these terms only appear in links pointing to this page" within the Google cache. Other than when there actually are such links, of course.

The highlighting function Google uses (understandably) doesn't support expanded queries, the obvious question being: what would it highlight?

1:18 pm on Nov 9, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 21, 2006
votes: 0

I'd add one that I come across often... and that irritates me so much that I sometimes even give up.

Google seems to dilute results for search phrases in quotation marks with matches that clearly should not be there. This happens not only when there is NO instance of the phrase in their index (e.g. they couldn't find a single "widgety widgeteering widget") but also when they choose to ignore the phrase, or just one word from it.

It gets on my nerves every time, especially because these are, as in tedster's example, usually tech (support) related queries.


