
Google SEO News and Discussion Forum

    
Google's Query Expansion Capabilities: Observations
Receptional Andy




msg:3768809
 8:12 pm on Oct 18, 2008 (gmt 0)

What you search is not always what you get

I would consider myself a fairly precise searcher - I know exactly what I'm typing. Nonetheless, Google often searches for what they think I really meant, as opposed to what I actually entered - a process sometimes known as query expansion.

Common examples of query expansion:

Word stemming

Google introduced word stemming at least five years ago: a search for [widget] also matches [widgets], [widgeting] and [widgeteering]. The keyword entered is reduced to a root or 'stem' ('widget' in the examples above), and words built on the same stem can be matched.
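
To make the idea of a 'stem' concrete, here is a toy suffix-stripping sketch. It is only an illustration of the general technique: the suffix list and the function names (stem, same_stem) are made up for this example, and nothing here is claimed to be how Google actually does it.

```python
# Toy suffix-stripping stemmer. Illustration only: the suffix list is a
# hypothetical, hand-picked one, not anything Google is known to use.
SUFFIXES = ("eering", "ing", "s")  # checked in this order, longest first


def stem(word):
    """Reduce a word to a crude stem by stripping one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def same_stem(query_word, page_word):
    """True if both words reduce to the same stem."""
    return stem(query_word) == stem(page_word)


for variant in ("widget", "widgets", "widgeting", "widgeteering"):
    print(variant, "->", stem(variant))
```

All four variants reduce to 'widget' here, which is why a search for one of them could match pages using any of the others. Real stemmers use far more careful rules than a three-item suffix list.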

Acronyms/initialisms

An [FAQ] is a set of [frequently asked questions]. Google has impressive mappings of acronyms and initialisms to the full phrase. I think this would be an interesting database if it was ever made available.
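
As a rough sketch of what such a mapping could look like in use, the snippet below expands a query into variants with the acronym spelled out. The ACRONYMS table and the expand_acronyms name are assumptions for illustration; the real data behind Google's behaviour is not public.

```python
# Hypothetical acronym table; a real one would be far larger and
# presumably derived from data rather than typed in by hand.
ACRONYMS = {
    "faq": "frequently asked questions",
}


def expand_acronyms(query):
    """Return the normalised query plus variants with acronyms spelled out."""
    words = query.lower().split()
    variants = [" ".join(words)]
    for i, word in enumerate(words):
        if word in ACRONYMS:
            expanded = words[:i] + [ACRONYMS[word]] + words[i + 1:]
            variants.append(" ".join(expanded))
    return variants


print(expand_acronyms("widget FAQ"))
```

A search engine working along these lines would then match pages against any of the returned variants rather than only the text typed in.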

Mis-spellings and typos

If you make an obvious typo, Google can include sites that only use the correctly-spelled word (in addition to the "did you mean:" prompt). This is much more obvious with certain queries.
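
One simple way to picture this is fuzzy matching against a list of known words. The sketch below uses Python's standard difflib module for that; the VOCABULARY list, the correct function and the 0.75 cutoff are all assumptions picked for the example, and Google's own approach is presumably driven by much richer data.

```python
import difflib

# Hypothetical vocabulary of correctly-spelled words seen in the index.
VOCABULARY = ["widget", "gadget", "search", "expansion"]


def correct(word, cutoff=0.75):
    """Return the closest known word, or the input unchanged if none is close."""
    word = word.lower()
    if word in VOCABULARY:
        return word
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=cutoff)
    return matches[0] if matches else word


print(correct("widgit"))
```

With a correction step like this, pages that only ever use the correctly-spelled word can still be returned for the mistyped query.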

Less common examples of query expansion:

Synonyms

Synonyms may be too narrow a definition, since the search operator for synonyms (~) reveals words that seem to have been derived from co-occurrence data, and have very distinct meanings. Nonetheless, it seems to be possible for Google to expand your query to include related words.
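
To illustrate how co-occurrence data can relate words that are not true synonyms, here is a minimal sketch that ranks words by how many context words they share with a target word. The tiny DOCS corpus and the related_words function are invented for the example; this is one plausible reading of 'derived from co-occurrence data', not a description of Google's method.

```python
from collections import Counter

# Tiny made-up corpus standing in for pages in an index.
DOCS = [
    "cheap widget store",
    "cheap gadget store",
    "widget repair guide",
    "gadget repair guide",
    "blue sky photo",
]


def related_words(word, docs, top=3):
    """Rank other words by how many context words they share with `word`."""
    contexts = {}
    for doc in docs:
        words = set(doc.split())
        for w in words:
            # A word's context is every other word it appears alongside.
            contexts.setdefault(w, set()).update(words - {w})
    target = contexts.get(word, set())
    scores = Counter()
    for other, ctx in contexts.items():
        shared = len(target & ctx)
        if other != word and shared:
            scores[other] = shared
    return [w for w, _ in scores.most_common(top)]


print(related_words("widget", DOCS, top=1))
```

Here 'gadget' comes out as the closest word to 'widget' purely because both appear in the same contexts, even though the two words mean different things.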

Translations

I see reflections of this in the interesting search result translation [translate.google.com] service. In some instances, Google seems to translate search keywords into other languages and return results from that language. I haven't really pinned down the pattern as to which queries (and pages) get this treatment, but I've seen quite a few examples where non-English keywords match English pages.

Ignored words

I occasionally see searches where words appear to have been dropped completely from the query. It's possible that certain keywords might be deemed to lack significance, and can return results with those words omitted from the search. I've only seen a few examples that point directly to this behaviour.

Interestingly, not all content in the index gets the query-expansion treatment. I've seen results that suggest a more wide-reaching characteristic of URLs likely to get fuzzy-matching, but that's probably for another day ;)

In most of the common cases, it seems clear that rewriting of the search query can occur, even if the expanded words are not:

  • within the on-page content
  • within links to the URL
  • or even in text adjacent to links to the URL

Of course, in many cases one of the above conditions is true, which can make finding true examples of query expansion much more difficult. In addition, many of the processes involved seem to be based on aggregated data from content within the index, or based on user search behaviour - which means that there is more or less useful data available depending on the popularity and frequency of occurrence of the search keyword.

Whether query expansion occurs also seems to be related to the entire search query - certain formulations are much more likely to trigger expansion than others. Possibly this has both linguistic (e.g. not expanding a word that is used as part of a common phrase) and statistical (e.g. based on user behaviour) aspects.

Does anyone know any other examples of Google's query expansion capabilities, or have any other observations?

Note to other power searchers - prefix each search keyword with a plus symbol to bypass most query expansion processes.
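
If you build search URLs in scripts, the prefixing can be automated. This is a small sketch assuming the plus operator behaves as described in this thread; the force_exact and search_url names are made up for the example, and terms that already start with an operator or a quote are left untouched.

```python
from urllib.parse import urlencode


def force_exact(query):
    """Prefix each bare keyword with '+'; leave existing operators alone."""
    terms = []
    for term in query.split():
        if term.startswith(("+", "-", '"')):
            terms.append(term)
        else:
            terms.append("+" + term)
    return " ".join(terms)


def search_url(query):
    """Build a Google search URL for the plus-prefixed query."""
    return "https://www.google.com/search?" + urlencode({"q": force_exact(query)})


print(force_exact("widget bug fix"))
print(search_url("widget bug"))
```

Note that the plus signs have to be percent-encoded in the URL (as %2B), because a literal + in a query string stands for a space.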

 

tedster




msg:3768827
 9:02 pm on Oct 18, 2008 (gmt 0)

Ignored words

That happens to me particularly in longer technical searches. I often need to remember the + sign to get the word "bug" or "problem" included in the result set. Without the +, that key term can sometimes be ignored even when I make it the first word.

Receptional Andy




msg:3780704
 4:51 pm on Nov 5, 2008 (gmt 0)

Just as a footnote to this (as I seem to get asked a fair bit): query expansion is also one of the most common reasons for the message "these terms only appear in links pointing to this page" within the Google cache. Other than when there actually are such links, of course.

The highlighting function Google uses (understandably) doesn't support expanded queries, the obvious question being: what would it highlight?
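
A small sketch shows why a literal highlighter comes up empty on a page that only matched through expansion. The highlight function and the sample page text are hypothetical, and this is not a claim about how the Google cache is implemented.

```python
import re


def highlight(text, terms):
    """Wrap literal whole-word occurrences of each term in ** markers.

    Returns the marked-up text and the number of occurrences found.
    """
    hits = 0
    for term in terms:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        text, count = pattern.subn(lambda m: "**" + m.group(0) + "**", text)
        hits += count
    return text, hits


page = "Our guide to widgeteering for beginners."

# The page may have ranked for [widgets] via stemming, but the literal
# term is not on the page, so nothing gets highlighted.
print(highlight(page, ["widgets"]))

# Only the word actually on the page can be highlighted.
print(highlight(page, ["widgeteering"]))
```

With zero literal hits on the page, falling back to a message about the terms only appearing in links is an understandable default.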

Miamacs




msg:3782948
 1:18 pm on Nov 9, 2008 (gmt 0)

I'd add one that I come across often... and get so irritated by that I sometimes even give up.

Google seems to dilute results for search phrases between quotation marks with matches that clearly should not be there. This happens not only when there is NO instance of the phrase in their index (e.g. they didn't find a single "widgety widgeteering widget") but also when they decide to ignore the phrase, or just one word from it.

It gets on my nerves every time, especially because these are usually tech (support) related queries, same as tedster's example.

...

