

Google Word Variations (Stemming)

Since the advancement of stemming, Google is less relevant

         

RichTC

2:02 pm on Jul 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is it just me, or has anyone else noticed the decline in the SERPs following the rollout of word variations (stemming) in Google, where it treats one word the same as another?

I could give zillions of examples where Google is completely off the mark, but I can't list them here due to WebmasterWorld policy, so I'll have to use the widget example.

Let's say your detailed authority site has dedicated pages about "Blue Widgets" and "Red Widgets", and you also have pages about "Blue-black Widgetvillers" and "Red-white Widgetvillers".

Now, Google has decided via its stemming information (probably due to webmasters buying AdWords for both "Blue Widgets" and "Blue-black Widgetvillers") that these terms are associated, when in fact they are different terms. (I'm not posting about plurals here, but about different words altogether that G thinks are related.)

When the search user types "Blue Black widgetvillers" into the search box, Google no longer returns the dedicated pages for that search as it used to. It now delivers any page whatsoever on the net, from blue widgets to bluish widgets to bluey widgots to blue black widgetvillers.

So an authority site may find Google listing its page about the blue widget rather than the dedicated page about the blue-black widgetvillers that the search user was looking for.

Moreover, a webmaster quickly finds that their dedicated pages may not rank for other search terms, as their dedicated "blue-black widgetvillers" page is being returned by Google for another user's search for "blue widgots".
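To make the widget example concrete, here is a minimal Python sketch of the behaviour described above. Everything in it is an assumption for illustration: the `CONFLATE` table, the page URLs, and the "every query term must appear" matching rule are invented to mirror the widget example and say nothing about how Google actually works.

```python
# Hypothetical conflation table: the "stem" an engine might map each word to.
# These groupings are invented for the widget example, not taken from Google.
CONFLATE = {
    "widgets": "widget",
    "widgots": "widget",
    "widgetvillers": "widget",
    "bluish": "blue",
    "bluey": "blue",
}

# Hypothetical site: URL -> the key phrase each dedicated page targets.
PAGES = {
    "/blue-widgets": "blue widgets",
    "/red-widgets": "red widgets",
    "/blue-black-widgetvillers": "blue-black widgetvillers",
    "/red-white-widgetvillers": "red-white widgetvillers",
}


def tokens(text):
    """Lower-case the text and split it into words, treating hyphens as spaces."""
    return text.lower().replace("-", " ").split()


def normalise(words, conflate):
    """Return the set of terms, optionally mapped through the conflation table."""
    return {CONFLATE.get(w, w) if conflate else w for w in words}


def matches(query, conflate):
    """Return the sorted URLs of pages that contain every (normalised) query term."""
    wanted = normalise(tokens(query), conflate)
    return sorted(
        url
        for url, phrase in PAGES.items()
        if wanted <= normalise(tokens(phrase), conflate)
    )


exact = matches("blue widgots", conflate=False)
loose = matches("blue widgots", conflate=True)
```

With exact matching, "blue widgots" finds nothing on this toy site. Once the three widget words are treated as one term, the same query pulls in both the blue widgets page and the dedicated blue-black widgetvillers page, which is the kind of cross-ranking the post complains about.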

All in all, I believe Google has introduced this to stop webmasters optimising sites for multiple keywords, in an attempt to push up AdWords purchases, and at the same time to give the search user less precise results in the hope that they will turn more to sponsored adverts, where another webmaster may be bidding on the term "Blue Widgots" but using the "Blue-black Widgetvillers" title on their listing.

IMO stemming doesn't work or produce quality SERPs matched to the keyword string. It's only any good for plurals. Outside of this, Google is trying to run before it can walk, and that's why some search results are plain garbage.

To quote Google:-
"Google uses Stemming technology. Thus, when appropriate, it will search not only for your search terms, but also for words that are similar to some or all of those terms. If you search for pet lemur dietary needs, Google will also search for pet lemur diet needs, and other related variations of your terms. Any variants of your terms that were searched for will be highlighted in the snippet of text accompanying each result."

[google.com...]

Problem is they haven't stopped at mild variants!
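For anyone unfamiliar with the mechanism, a toy suffix-stripping stemmer shows both the "mild variant" case from Google's quote and how the same rules can overreach. This is only a sketch: the `SUFFIXES` list, the `min_stem` cutoff and the `toy_stem` function are made up for illustration and are not Google's algorithm or any published stemmer.

```python
# Arbitrary illustrative suffix rules, tried in this order. Not a real stemmer.
SUFFIXES = ["ary", "ing", "es", "s"]


def toy_stem(word, min_stem=3):
    """Strip the first matching suffix, provided at least min_stem letters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


def same_term(a, b):
    """True if the toy stemmer would treat the two words as the same search term."""
    return toy_stem(a) == toy_stem(b)


mild = same_term("dietary", "diet")    # the variant from Google's own example
overreach = same_term("binary", "bins")  # two unrelated words collapsing together
```

Both comparisons come out true under these rules: "dietary" and "diet" share the stem "diet", which is the helpful case, while "binary" and "bins" both collapse to "bin" even though they have nothing to do with each other.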

[edited by: engine at 2:55 pm (utc) on July 4, 2006]
[edit reason] added link to the stemming section [/edit]

tedster

8:47 pm on Jul 4, 2006 (gmt 0)




I agree that stemming and semantics in general are not yet what they aim to be -- but there is a decent amount of targeted traffic coming to some of my clients because of it, too. Search phrases that include the noun version of a word may also return pages with the verb form, or the adjective form, and this is sometimes generating excellent traffic. But I also see some really whacky stuff in the search results.

Still, in some niches, I think spam is a much bigger problem than poor relevancy due to stemming. And I also think that the semantics portions of the algo are continuing to improve. It's a work in progress, you know? Here's a case where I am very willing to use that "Dissatisfied" link at the bottom of a SERP. I think intelligent feedback on this issue can only help Google improve.

RichTC

4:27 pm on Jul 7, 2006 (gmt 0)




tedster,

I'm at a point now where I think Google has lost all relevance, as it continues to drive search using semantics.

It feels like Google picks the worst page on your site that may have a word variation on it and lists that one in the SERPs rather than your dedicated page about the subject!

I've also noticed that Google gives weight to a page with just a link on it about the subject matter. I think they have this dial turned far too high!

As a regular user of Google, I have to say I hate the SERPs now. It's a real mix-up with next to no relevancy. I just don't get it!

tedster

4:53 pm on Jul 7, 2006 (gmt 0)




In some searches I see the same kind of thing - reminds me of the old stupid results on AltaVista at times. The relevance of some top ten pages is mind-blowingly poor.

soapystar

5:31 pm on Jul 7, 2006 (gmt 0)




Yes. I believe it will always lead to lower quality results because of the way people search. When people do a search it's very targeted. There isn't really a need to second-guess what they are looking for: if the first search is poor, the user will adjust the term themselves, and that will always involve less random and vague terms than using general related terms. Related terms will always be broader than the actual specific subject, so targeted SERPs are always going to be diluted with off-topic results. And that's when they work. In many cases the results are bringing up totally unrelated results.

This all seems part of the search for AI. Even though it remains debatable whether it's actually possible, the will exists in all engines to pretend it's inevitable that searches will be done by AI, and to place all guessed elements of it into the current searches, semantics being one of them. Google, being better than the rest (imho), is of course more advanced in its use, but is returning poorer results most of the time for it. Yes, you will always find examples where it picks out something useful you hadn't been aware of or hadn't thought to search for, but in few enough cases to make this only really useful as an added button to broaden a search.

Of course it's also part of the battle against spam. Or so we are led to believe. Actually it's pretty pointless trying to filter spam by looking for broad matches on site if you are going to let them into the SERPs anyway because they have 20,000 inbound links from cloaked, web-design-controlled sites.