Google's Florida Update - a fresh look


claus - 4:30 pm on Dec 19, 2003 (gmt 0)


merlin30:
The assumption that a search engine must make about any page it finds is that the page contains mostly nonsense and is of little value - until *reliable* evidence suggests otherwise.
For some odd subsets of pages this seems like an okay assumption, but across the whole 4 billion page set, I'd say it is the reverse: the page in question generally has value, you just have to figure out for what purposes it has that value.

>> Google doesn't yet know that a Keyboard Gift is a type of Gift.
>> So it doesn't highlight Gifts and Gifts. Why doesn't it know?

Now, that's an interesting one, and it does shed some light on what's really happening. The word "gift" is not just a gift, you see. Here are five queries, the last two in Danish. None of them is so specific that it can harm or benefit any members, so I think they're okay to post:

1) christmas gift
- identifies topic of "gifts", stemming or broad match occurs.

2) birthday gift
- identifies topic of "gifts", stemming or broad match occurs.

3) keyboard gift
- does not identify topic of "gifts", only singular "gift" is matched/highlighted.

4) blev gift (Danish for "were married" as in "they were married")
- does not identify topic of "gifts", only singular "gift" is matched/highlighted.

5) rotte gift (Danish for "rat poison")
- does not identify topic of "gifts", only singular "gift" is matched/highlighted.

So, clearly there is some kind of ruleset that decides that if "gift" is used near "christmas" or "birthday", then it's a search on the topic of gifts, and both the singular and plural versions are matched. If "keyboard" were a common occasion for gift-giving, the stemming would occur here too, but it isn't, so it doesn't.

As a lot of words have more than one meaning (nonsense words excepted), it does not make sense to focus exclusively on one particular sense of a word unless you are confident that this sense is the intended one. For "gift" it seems that "christmas" and "birthday" are two such helper words that make the sense (or topic) of the word "gift" apparent. If no such helper words are found, the query is ambiguous.
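To make the idea concrete, here is a minimal sketch of what such a ruleset could look like. It is only a guess at the observed behaviour, not Google's actual method: the `TOPIC_HELPERS` table, the `expand_query` function, and the naive "add an s" plural are all hypothetical names and simplifications made up for illustration.

```python
# Hypothetical helper-word table: a term is only treated as a topic
# (and stemmed) when one of its helper words appears in the same query.
# The real ruleset, if there is one, is unknown.
TOPIC_HELPERS = {
    "gift": {"christmas", "birthday"},
}


def expand_query(query):
    """Return, for each query term, the set of word forms to match."""
    terms = query.lower().split()
    expanded = []
    for term in terms:
        helpers = TOPIC_HELPERS.get(term, set())
        others = set(terms) - {term}
        if helpers & others:
            # A helper word makes the sense clear, so match the plural too.
            # (Naive plural by appending "s"; an assumption for this sketch.)
            expanded.append({term, term + "s"})
        else:
            # Ambiguous sense: match only the exact form that was typed.
            expanded.append({term})
    return expanded
```

With this table, `expand_query("christmas gift")` matches both "gift" and "gifts", while `expand_query("keyboard gift")` and `expand_query("rotte gift")` match only "gift", which mirrors the five queries above.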

/claus


Added: yup, "married" and "poison" really are the same word in Danish, don't say Danes don't have a sense of humor (humour, even)
Edit: replaced "most words have more than one meaning" with "a lot of words have more than one meaning" as I don't even know all words, much less their meanings.

[edited by: claus at 5:58 pm (utc) on Dec. 19, 2003]


Thread source: http://www.webmasterworld.com/google_archive/20566.htm