Stemming

is google implementing stemming?

     
11:53 am on Oct 14, 2003 (gmt 0)

New User

10+ Year Member

joined:Oct 11, 2003
posts:39
votes: 0


Hi!
New here. I'm sure this topic has been discussed here long before, but maybe you won't mind replying again. Is Google implementing stemming or a thesaurus for the keywords searched for? It's been a topic of debate of late at various forums, but no one seems to know for sure. What do you guys think?
12:07 pm on Oct 14, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Mar 14, 2002
posts:207
votes: 0


Welcome to the forum, Napoleon. I must say you know which questions to ask to wake me up in the morning better than coffee.

You may know about Google's relatively new ~ syntax, which allows you to search for synonyms. To get an idea of how much ground it covers, search for blue and then ~blue; you will definitely get different result counts.

As for stemming, I miss it less than I thought I would since full-word wildcards are available...

12:20 pm on Oct 14, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Mar 13, 2003
posts:630
votes: 0


search for blue and then ~blue; you will definitely get different result counts

Hm, can't see a difference.
And the result counter has been a random number generator for a few weeks now, no matter what you search for.

since full-word wildcards are available...

Could you explain this a bit further?
12:24 pm on Oct 14, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:May 14, 2002
posts:1192
votes: 0


Welcome to WebmasterWorld, Napoleon!

I see no evidence that Google have changed their policy on stemming [google.com]:

To provide the most accurate results, Google does not use "stemming" or support "wildcard" searches. In other words, Google searches for exactly the words that you enter in the search box. Searching for "book" or "book*" will not yield "books" or "bookstore". If in doubt, try both forms: "airline" and "airlines," for instance.
12:31 pm on Oct 14, 2003 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 24, 2002
posts:1130
votes: 0


Another new google message? "By default, Google searches for variations of your search terms." [webmasterworld.com]
This is at least an indication they are working on something.
12:33 pm on Oct 14, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Mar 14, 2002
posts:207
votes: 0


Blue vs. ~Blue, that's funny. I definitely saw different result counts AND different order of results between the two searches. Try rose and ~rose for a dramatic count difference.

Full-word wildcards: Google doesn't support stemming, where you can stick a * at the end of a word and get variants on that word -- moon* finding moonlight, moondance, mooning, etc. But Google DOES support full-word wildcards, where you can substitute * for a word. For example, searching Google for "three * mice" finds three blind mice, three blue mice, three green mice, etc.

Make sense?
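
To make the distinction concrete, here is a minimal Python sketch of the two behaviours. It is only a toy illustration on plain strings, not how Google implements anything: `prefix_match` and `phrase_match` are hypothetical helper names, and what this post calls stemming is modelled here as simple trailing-wildcard (prefix) matching.

```python
import re

def prefix_match(pattern, words):
    """Trailing wildcard: 'moon*' matches every word that starts with 'moon'."""
    prefix = pattern.rstrip("*")
    return [w for w in words if w.startswith(prefix)]

def phrase_match(query, text):
    """Full-word wildcard: each '*' in the query stands for exactly one word."""
    parts = [r"\w+" if tok == "*" else re.escape(tok) for tok in query.split()]
    pattern = r"\b" + r"\s+".join(parts) + r"\b"
    return re.findall(pattern, text, flags=re.IGNORECASE)

# Trailing wildcard (the form the post says Google does not support)
print(prefix_match("moon*", ["moon", "moonlight", "mooning", "sun"]))

# Full-word wildcard inside a phrase (the form the post says Google does support)
print(phrase_match("three * mice", "Three blind mice and three blue mice, not three mice"))
```

The first call keeps the three words beginning with "moon"; the second finds "Three blind mice" and "three blue mice" but not the bare "three mice", because the `*` has to be filled by exactly one word.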

1:07 pm on Oct 14, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Sept 12, 2002
posts:252
votes: 0


Try rose and ~rose for a dramatic count difference

Or try ~flowers -flowers and see which words are being highlighted

1:09 pm on Oct 14, 2003 (gmt 0)

Senior Member from ZA 

WebmasterWorld Senior Member 10+ Year Member

joined:July 15, 2002
posts:1721
votes: 4


Blue vs. ~Blue, that's funny. I definitely saw different result counts AND different order of results between the two searches.

You're not going mad, I saw different results too :)

1:16 pm on Oct 14, 2003 (gmt 0)

Senior Member from MT 

WebmasterWorld Senior Member 10+ Year Member

joined:Apr 1, 2003
posts:1843
votes: 0


Actually, what you see as "full-word wildcards" has nothing to do with wildcards, but happens to have the same end effect. Basically google simply removes characters such as "*" and certain stop words from your query. It still recognises them for proximity ranking though. So
Three * Mice
becomes
Three [any word] Mice

which of course has the desired effect.

But it has nothing to do with any wildcard feature. In fact a search for
Three * Mice
yields the same results as a search for
Three and Mice
or
Three a Mice

Just nitpicking though, because in the end it's the effect that counts.

SN

2:16 pm on Oct 14, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 14, 2002
posts:422
votes: 0


There are some strange things afoot at google. If you have an adwords account look at the broad-matching keywords and try some searches. (my pet conspiracy theory)
1:59 pm on Oct 15, 2003 (gmt 0)

Full Member

10+ Year Member

joined:Mar 14, 2002
posts:207
votes: 0


>
Actually, what you see as "full-word wildcards" has nothing to do with wildcards, but happens to have the same end effect. Basically google simply removes characters such as "*" and certain stop words from your query.
>

Hi Killroy,

Google is not just removing them. If Google was just removing them then the searches "three * mice" and "three * * mice" would get the same results, and they don't. There's some kind of placeholding going on, whether you want to call it wildcard or something else.

A search for "three * mice" (note quotes) does NOT give the same results as "three and mice", as Google doesn't recognize many (any? Maybe "the"?) stopwords in phrases.

RBuzz
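
The argument above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Google's actual query handling: `strip_model` is a deliberately simplified reading of "just removing" the `*`, `placeholder_model` treats each `*` as a slot for one word, and all three function names are made up for illustration.

```python
def strip_model(query):
    """Simplified 'just removing' reading: every '*' is dropped from the query."""
    return [tok for tok in query.lower().split() if tok != "*"]

def placeholder_model(query):
    """Placeholder reading: every '*' becomes a slot (None) that holds one word."""
    return [None if tok == "*" else tok for tok in query.lower().split()]

def matches_phrase(slots, text):
    """True if the slots line up with consecutive words in text; None matches any word."""
    words = text.lower().split()
    n = len(slots)
    return any(
        all(s is None or s == w for s, w in zip(slots, words[i:i + n]))
        for i in range(len(words) - n + 1)
    )

one_star = "three * mice"
two_stars = "three * * mice"

# Under pure removal the two queries collapse to the same thing.
print(strip_model(one_star) == strip_model(two_stars))

# Under placeholding they stay different and match different phrases.
print(matches_phrase(placeholder_model(one_star), "three blind mice"))
print(matches_phrase(placeholder_model(two_stars), "three blind mice"))
print(matches_phrase(placeholder_model(two_stars), "three very blind mice"))
```

If the `*` were only stripped, both queries would be indistinguishable, so they could not return different results. Under the placeholder reading the one-star query matches "three blind mice" while the two-star query needs two words in between, which is consistent with the different results reported above.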

2:41 pm on Oct 15, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 23, 2002
posts:165
votes: 0


RBuzz is spot on. I don't know what use it is, but it is fun to play with!

Right, now I am going to do some work.

HP

 
