Stemming Appears To Be A Factor

Even though Google says it isn't

         

espeed

9:12 pm on Apr 29, 2003 (gmt 0)

10+ Year Member



Google says that it doesn't use stemming:

To provide the most accurate results, Google does not use "stemming" or support "wildcard" searches. In other words, Google searches for exactly the words that you enter in the search box. Searching for "googl" or "googl*" will not yield "googler" or "googlin". If in doubt, try both forms: "airline" and "airlines," for instance. -- [google.com...]

However, it appears that word-variations still add weight for a given phrase.

You can see this when a search for a phrase returns 2.3 million results and the #1 result contains the root phrase only once (in the meta description), while a variation of the phrase is repeated many times throughout the page. If stemming weren't coming into play, the lower-ranked pages that have a higher PageRank and higher keyword density for the root phrase would be returned above it. NOTE: I could not find any link text with the related phrase pointing back to the page.
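To make the distinction concrete, here is a minimal Python sketch that contrasts exact-word counting with stemmed counting. The `naive_stem` function is a deliberately crude, hypothetical suffix stripper used only for illustration; it is not Google's algorithm, nor a real stemmer such as Porter's, and the sample page text is made up.

```python
import re

def naive_stem(word):
    """Crude, hypothetical stemmer: strip a trailing "ing" or "s"."""
    for suffix in ("ing", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def count_exact(text, term):
    """Count occurrences of exactly `term` (no stemming)."""
    return tokenize(text).count(term.lower())

def count_stemmed(text, term):
    """Count words whose naive stem equals the naive stem of `term`."""
    target = naive_stem(term.lower())
    return sum(1 for word in tokenize(text) if naive_stem(word) == target)

# Made-up page text: the root word appears once, the plural twice.
page = "Cheap airlines. Compare airlines and book airline tickets."

print(count_exact(page, "airline"))    # 1
print(count_stemmed(page, "airline"))  # 3
```

If only exact matches counted, this page would have a single hit for "airline"; if variations are folded together, it has three. That difference is the effect the ranking observation above seems to point at.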

Thoughts?

brotherhood of LAN

9:21 pm on Apr 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



When the -link: tool used to work on Google, so did stemming. You could search for airlin* and get sites matching airline.

Since the -link: thing stopped working, I can't seem to do the same searches.

Searching for widge*... [google.com...]

Searching for webmasterwo*....
[google.com...]

espeed

7:33 pm on Apr 30, 2003 (gmt 0)

10+ Year Member



I understand that Google doesn't permit you to search for words with wildcards, but it appears that Google is factoring word variations into the ranking algo.

taxpod

7:39 pm on Apr 30, 2003 (gmt 0)

10+ Year Member



>>When the -link: tool used to work

Is the "-link: tool" different from the "link: tool" - what was its function?

heini

7:39 pm on Apr 30, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>NOTE: I could not find any link text with the related phrase pointing back to the page.

espeed, just to make sure: you did look at all links, not just the one Google gives you?

<added>taxpod, look for easter egg :)

espeed

5:19 pm on May 1, 2003 (gmt 0)

10+ Year Member




> NOTE: I could not find any link text with the related phrase
> pointing back to the page.

> espeed, just to make sure: you did look at all links, not just
> the one Google gives you?

As many as I could find. Does anyone else have data to support or refute this theory?

swerve

6:01 pm on May 1, 2003 (gmt 0)

10+ Year Member



> As many as I could find. Does anyone else have data to support or refute this theory?

Here are some stats on one of my sites, which seem to support the "no stemming" claim:

- ranked #1 for a query in the form "blue widgets" (second word pluralized) (of ~60,000 results)

- ranked #161 for the singular form "blue widget" (of ~80,000 results)

The plural form ("widgets") appears in the site title, meta description, and body. The singular form ("widget") appears in meta description and body.
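The field breakdown above can be sketched as an exact-word check per page field. The field contents below are hypothetical stand-ins for the description given here (the actual page was not posted), and the function makes no claim about how Google weights each field.

```python
import re

def fields_with_exact_term(fields, term):
    """Return names of fields containing `term` as a whole word (no stemming)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [name for name, text in fields.items() if pattern.search(text)]

# Hypothetical page: plural in title, description, and body;
# singular in description and body only.
page_fields = {
    "title": "Blue Widgets",
    "meta_description": "Blue widgets for sale. Find the right blue widget.",
    "body": "Our blue widgets ship worldwide. Every blue widget is tested.",
}

print(fields_with_exact_term(page_fields, "widgets"))
# ['title', 'meta_description', 'body']
print(fields_with_exact_term(page_fields, "widget"))
# ['meta_description', 'body']
```

With exact matching, the singular query misses the title entirely, which is at least consistent with the large gap between the two rankings.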

RawAlex

6:10 pm on May 1, 2003 (gmt 0)

10+ Year Member



Also, don't forget that anchor text on incoming links is important. A site with a single instance of a phrase in the title and headline, for example, and 100 good inbound links using the same phrase SHOULD do better than a spam page with the phrase repeated 100 times but few good inbound links.

Because SERPs and PR are complicated things, it is very poor science to try to extract a "does or doesn't" from the results to prove a single point. You don't have anywhere near all the data, so it is hard to tell which of many factors is responsible for a given ranking.

Remember, most people involved in auto accidents also ate a meal within 4 hours of driving their cars. Therefore, eating could be a cause of auto accidents! :-)

Alex