

Google News Archive Forum

    
Stemming
Is Google implementing stemming?
napoleon bona part 2




msg:75959
 11:53 am on Oct 14, 2003 (gmt 0)

Hi!
New here. I'm sure this topic must have been discussed here long before, but maybe you people won't mind replying again. Is Google implementing stemming or a thesaurus for the keywords searched for? It's been a topic of debate of late at various forums, but no one seems to know for sure. What do you guys think?

 

RBuzz




msg:75960
 12:07 pm on Oct 14, 2003 (gmt 0)

Welcome to the forum Napoleon. I must say you do know the questions to ask to wake me up in the morning better than coffee.

You may know about Google's relatively new ~ syntax, which allows you to search for synonyms. To get an idea of how much ground it covers, search for blue and then ~blue; you will definitely get different result counts.

As for stemming, I miss it less than I thought I would since full-word wildcards are available...

plasma




msg:75961
 12:20 pm on Oct 14, 2003 (gmt 0)

search for blue and then ~blue; you will definitely get different result counts

Hm, can't see a difference.
And the counter has been a random number generator for a few weeks now, no matter what you search for.

since full-word wildcards are available...

Could you explain this a bit further?

Mohamed_E




msg:75962
 12:24 pm on Oct 14, 2003 (gmt 0)

Welcome to WebmasterWorld, Napoleon!

I see no evidence that Google have changed their policy on stemming [google.com]:

To provide the most accurate results, Google does not use "stemming" or support "wildcard" searches. In other words, Google searches for exactly the words that you enter in the search box. Searching for "book" or "book*" will not yield "books" or "bookstore". If in doubt, try both forms: "airline" and "airlines," for instance.
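
To make the distinction in the quoted help text concrete, here is a minimal Python sketch contrasting exact-word matching with a deliberately naive suffix-stripping stemmer. The function names and the two-entry suffix list are invented for illustration; this is not Google's code, and real stemmers (Porter's, for instance) apply far more rules.

```python
import re

# Hypothetical, deliberately tiny suffix list; an assumption for illustration.
SUFFIXES = ("ing", "s")

def naive_stem(word):
    """Strip one common suffix, keeping at least three characters of stem."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def exact_match(query, text):
    """Exact-word search: the query must appear as a whole word."""
    return query.lower() in tokenize(text)

def stemmed_match(query, text):
    """Stemmed search: the query and the page words are compared by stem."""
    return naive_stem(query) in {naive_stem(t) for t in tokenize(text)}
```

With these definitions, an exact search for "book" misses a page that only says "books", while the stemmed search finds it. Neither finds "bookstore", which is a compound rather than an inflection; that would need a prefix wildcard such as "book*".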

takagi




msg:75963
 12:31 pm on Oct 14, 2003 (gmt 0)

Another new google message? "By default, Google searches for variations of your search terms." [webmasterworld.com]
This is at least an indication they are working on something.

RBuzz




msg:75964
 12:33 pm on Oct 14, 2003 (gmt 0)

Blue vs. ~Blue, that's funny. I definitely saw different result counts AND different order of results between the two searches. Try rose and ~rose for a dramatic count difference.

Full-word wildcards: Google doesn't support stemming, where you can stick a * at the end of a word and get variants on that word -- moon* finding moonlight, moondance, mooning, etc. But Google DOES support full-word wildcards, where you can substitute * for a word. For example, searching Google for "three * mice" finds three blind mice, three blue mice, three green mice, etc.

Make sense?

Hagstrom




msg:75965
 1:07 pm on Oct 14, 2003 (gmt 0)

Try rose and ~rose for a dramatic count difference

Or try ~flowers -flowers and see which words are being highlighted

creative craig




msg:75966
 1:09 pm on Oct 14, 2003 (gmt 0)

Blue vs. ~Blue, that's funny. I definitely saw different result counts AND different order of results between the two searches.

You're not going mad, I saw different results too :)

killroy




msg:75967
 1:16 pm on Oct 14, 2003 (gmt 0)

Actually, what you see as "full-word wildcards" has nothing to do with wildcards, but happens to have the same end effect. Basically google simply removes characters such as "*" and certain stop words from your query. It still recognises them for proximity ranking though. So
Three * Mice
becomes
Three [any word] Mice

which of course has the desired effect.

But it has nothing to do with any wildcard feature. In fact a search for
Three * Mice
yields the same results as a search for
Three and Mice
or
Three a Mice

Just nitpicking though, because in the end it's the effect that counts.
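
The model described in this post can be sketched in a few lines of Python. This is only an illustration of the poster's theory, not of Google's actual query processing: the stop-word list and the `normalize` helper are assumptions made up for the example.

```python
# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of"}

def normalize(query):
    """Replace '*' and stop words with None while preserving their positions."""
    return [
        None if tok == "*" or tok in STOP_WORDS else tok
        for tok in query.lower().split()
    ]

# Under this model the three queries from the post become indistinguishable:
# each one reduces to the word "three", one empty slot, then the word "mice".
same = (
    normalize("Three * Mice")
    == normalize("Three and Mice")
    == normalize("Three a Mice")
)
```

The empty slot (None) is what would keep the proximity information: the two remaining words are still known to be one position apart.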

SN

shrirch




msg:75968
 2:16 pm on Oct 14, 2003 (gmt 0)

There are some strange things afoot at Google. If you have an AdWords account, look at the broad-matching keywords and try some searches. (My pet conspiracy theory.)

RBuzz




msg:75969
 1:59 pm on Oct 15, 2003 (gmt 0)

>
Actually, what you see as "full-word wildcards" has nothing to do with wildcards, but happens to have the same end effect. Basically google simply removes characters such as "*" and certain stop words from your query.
>

Hi Killroy,

Google is not just removing them. If Google was just removing them then the searches "three * mice" and "three * * mice" would get the same results, and they don't. There's some kind of placeholding going on, whether you want to call it wildcard or something else.
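
The placeholder behaviour argued for here can be illustrated with a short Python sketch. It assumes, as the post suggests, that each * stands for exactly one word; the `phrase_matches` function is hypothetical and says nothing about how Google implements this internally.

```python
def phrase_matches(query, text):
    """Return True if the query words occur in order in text,
    with each '*' in the query standing for exactly one word."""
    q = query.lower().split()
    words = text.lower().split()
    # Slide a window of the query's length across the text.
    for start in range(len(words) - len(q) + 1):
        window = words[start:start + len(q)]
        if all(qt == "*" or qt == w for qt, w in zip(q, window)):
            return True
    return False
```

Under that assumption, "three * mice" matches "three blind mice" but not "three very blind mice", and "three * * mice" does the reverse. If the asterisks were simply discarded, both queries would have to return the same pages.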

A search for "three * mice" (note quotes) does NOT give the same results as "three and mice", as Google doesn't recognize many (any? Maybe "the"?) stopwords in phrases.

RBuzz

HenryUK




msg:75970
 2:41 pm on Oct 15, 2003 (gmt 0)

RBuzz is spot on. I don't know what use it is, but it is fun to play with!

Right, now I am going to do some work.

HP


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved