Forum Moderators: Robert Charlton & goodroi


does search word order matter?


surfergirl

12:32 am on Jan 20, 2007 (gmt 0)

10+ Year Member



Just spoke with someone about a site that is in the top 10 results when searching with a three-word set. If you reverse the order, the site is nowhere near the top 10. I can't see why this is. Does Google put any weight on the order of the search terms?

Example: search using the words:
kwd1 kwd2 kwd3

gives different results than
kwd3 kwd1 kwd2

Does Google filter using the first word, then the second, then the third?

I can't find any online documentation that explains why this occurs. Does anyone know of any? What causes this?

-thanks.

<Sorry, no specific keywords.
See Forum Charter [webmasterworld.com]>

[edited by: tedster at 2:58 pm (utc) on Jan. 20, 2007]

dickbaker

11:30 pm on Jan 20, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Google is looking for the keyword phrases that most accurately reflect what the searcher has typed in. That's a good reason to alter the order of your keywords in the title tag and in the page text. For example, the title could read, "Acme Blue Widgets--Blue Acme Widgets."

MThiessen

5:21 am on Jan 21, 2007 (gmt 0)

10+ Year Member



I agree. Search for daily horoscopes, then try horoscopes daily, as an example (no quotes). You will see different results. Strange, but I have noticed it too.

inbound

6:17 am on Jan 21, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Word order is a keystone of any decent relevance algorithm. Google does not treat each word separately; that would throw away a lot of data from the search string.

Matching word groups (or n-grams) is quite complicated in practice (when you are working with billions of documents), but simple enough to grasp. Take this simplified example:

Search Terms-
A bakers dozen
A dozen bakers

2 documents to search-
The number thirteen is said to be a bakers dozen.
The worlds largest cake took a dozen bakers to make.

Both documents have the same number of matches if you count the words separately, but they talk about very different things (context). If you look for pairs of words (in the first search, "a bakers" and "bakers dozen") or all 3 words in order, you can give a higher score to these matches. In practice you would also want to adjust the importance of a phrase by looking at its "informational value": words that appear very often are usually of less "value" than ones that are less common (hence "bakers dozen" has more informational value than "a bakers").
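To make the bakers example concrete, here is a minimal Python sketch of order-aware matching. It is only an illustration, not how Google scores anything: the function names, the crude tokenizer, and the weighting (a matched n-gram is worth n points) are all assumptions made for this example, and the "informational value" adjustment is left out.

```python
def tokenize(text):
    # Deliberately crude: lowercase and strip trailing periods/commas.
    return [w.strip(".,").lower() for w in text.split()]

def ngrams(tokens, n):
    # All runs of n consecutive words, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def order_aware_score(query, document, max_n=3):
    """Count query n-grams (n = 1..max_n) that also occur in the document.
    A matched n-gram adds n points; that weight is an arbitrary assumption."""
    q, d = tokenize(query), tokenize(document)
    score = 0
    for n in range(1, max_n + 1):
        doc_grams = set(ngrams(d, n))
        score += n * sum(1 for g in ngrams(q, n) if g in doc_grams)
    return score

docs = [
    "The number thirteen is said to be a bakers dozen.",
    "The worlds largest cake took a dozen bakers to make.",
]

for query in ("A bakers dozen", "A dozen bakers"):
    print(query, [order_aware_score(query, doc) for doc in docs])
```

Counting single words alone, both documents tie at 3 for either search. Once pairs and triples are counted, each search clearly favours the document that has the words in the same order.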

Google has made huge amounts of n-gram data publicly available (for the linguistics community). There is no doubt that mind-boggling amounts of processing go on at Google in this area.

Lots of characteristics can be inferred by the order of words. Decisions on the subject of a document can be made without having to "understand" a document, by comparing the frequency of n-grams from the document to a much larger dataset (the web). It's a great way to decide which phrases are "important" on a page that has AdSense on it.
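A rough sketch of that frequency comparison, in Python. Everything here is a toy assumption for illustration: the background text stands in for "the web", the function names are made up, and the add-one smoothing is just one simple way to avoid dividing by zero.

```python
from collections import Counter

def bigrams(text):
    # Crude tokenizer, then every pair of adjacent words.
    words = [w.strip(".,").lower() for w in text.split()]
    return list(zip(words, words[1:]))

def distinctive_bigrams(document, background, top=3):
    """Rank the document's bigrams by how much more frequent they are
    in the document than in the (toy) background text."""
    doc_counts = Counter(bigrams(document))
    bg_counts = Counter(bigrams(background))
    doc_total = sum(doc_counts.values())
    bg_total = sum(bg_counts.values())

    def ratio(gram):
        doc_rate = doc_counts[gram] / doc_total
        # Add-one smoothing so unseen background bigrams do not divide by zero.
        bg_rate = (bg_counts[gram] + 1) / (bg_total + 1)
        return doc_rate / bg_rate

    return sorted(doc_counts, key=ratio, reverse=True)[:top]

# Toy stand-in for a much larger dataset.
background = "a cake is a treat and a cake is a gift and a dozen is a number"
document = "a bakers dozen is a bakers dozen"

print(distinctive_bigrams(document, background, top=2))
```

Pairs that are common everywhere, like "is a", sink to the bottom, while pairs that are unusually frequent in this one document rise to the top. No "understanding" of the text is needed.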

Google makes excellent use of the data it collects from many areas; AdWords is another fine example. Millions of adverts and phrases are grouped together by hand, by advertisers. The reliability of this data is very high, given that people are paying for these adverts. Hence Google can look at all of the adverts that are supposed to show for one phrase and statistically predict which other phrases should be similar. Such data is great when looking at how to do 'broad match' etc. Google is better at collecting and manipulating textual data than Yahoo or MSN; hence they have a massive lead when it comes to textual ad-serving.

Regardless of how Google's algorithms change (and hence favour you or not), you can be sure that a great deal of importance is put on word order. Pre-calculating n-gram statistics on a huge scale is also such a big task that you probably should only worry about 3 word combinations at the moment (remembering that 4 word combinations can be fairly well replicated by two 3 word combinations - e.g. "word1 word2 word3" AND "word2 word3 word4" is quite likely to give documents that have "word1 word2 word3 word4").
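The 3 word trick can be sketched in a few lines of Python. This is an illustration under assumed names, not a real index: a document "matches" a 4 word phrase if it contains both overlapping 3 word combinations. The second test document shows why the result is only "quite likely" correct rather than guaranteed.

```python
def trigrams(text):
    # Set of all runs of three consecutive words.
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def matches_via_trigrams(phrase, document):
    """True if every 3 word combination in the phrase occurs in the document."""
    return trigrams(phrase) <= trigrams(document)

phrase = "word1 word2 word3 word4"

doc_hit = "intro word1 word2 word3 word4 outro"
# Contains both 3 word combinations, but never the full 4 word phrase.
doc_false_positive = "word2 word3 word4 then later word1 word2 word3"
doc_miss = "word1 word2 word3 only"

for doc in (doc_hit, doc_false_positive, doc_miss):
    print(matches_via_trigrams(phrase, doc), "<-", doc)
```

The false positive needs both combinations to appear separately in the same document, which is rare in real text, so the approximation holds up well.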

Have fun.