Google Says It Ignores Common Words, But...

... if so, why do the SERPs differ?

         

michael heraghty

2:00 pm on Jun 27, 2004 (gmt 0)

10+ Year Member



I'm not the first person to bring this up, but no-one replied to it the last time:

[webmasterworld.com...]

Yet, I think the subject is important. Here's the issue: Google claims that it ignores common words in searches. It certainly *used* to. But it doesn't seem to anymore.

For example, try a search on "widgets in placename" (without the quotation marks). Now try a search for "widgets placename" (again, without the quotation marks).

Google will say: "in" is a very common word and was not included in your search.

If that's the case, then why are the SERPs so different for each of the searches?!

If the common word were ignored, the SERPs should be the same for both. As far as I recall, this used to be the case. It no longer is, however.
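To make that expectation concrete, here is a minimal sketch of what "ignoring" a common word would mean. The stop-word list and the `normalize_query` helper are assumptions for illustration only; Google's actual list and query processing are not public.

```python
# Hypothetical stop-word list; Google's real list is not public.
STOP_WORDS = {"in", "of", "the", "a", "and"}


def normalize_query(query):
    """Lowercase the query and drop stop words, keeping word order."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


with_stop_word = normalize_query("widgets in placename")
without_stop_word = normalize_query("widgets placename")

# If "in" really were dropped before matching, both queries would reduce
# to the same list of terms and so could not produce different results.
print(with_stop_word == without_stop_word)
```

If the engine behaved like this sketch, the two searches would be indistinguishable after normalization, so differing SERPs suggest the common word is still being used somewhere.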

Is Google misleading searchers?

Robert Charlton

5:39 am on Jun 29, 2004 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



If Google is filtering out certain words...

We assume they're not filtering them. Rather, because the words are so common and used so often, and are assumed not to affect the meaning of queries, they're not indexed. It wouldn't make sense to index them and then filter them out (OK... no sarcastic comments about Florida here ;)).

Google, I'm told, keeps everything in RAM rather than on hard drives. Maybe, when memory prices come down enough, and processors get even faster, they'll include more of the stop words in their index.

In the meantime, what is going on to produce these differing results?

digitalv

5:55 am on Jun 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think you all should stop wasting your time trying to figure out Google - every time someone gets close they go and freakin change everything and we all have to figure out how to optimize all over again.

Just leave well enough alone :)

michael heraghty

11:02 am on Jun 29, 2004 (gmt 0)

10+ Year Member



Robert: very interesting.

My initial searches showed that * in * and * of * produced the same results, but as I search for more competitive keyphrases, I'm seeing lots of differences in the SERPs.

These differences are subtle -- usually indicating a reshuffle rather than dramatically different results, so the weight factor is probably small. Still, it is significant when it comes to those competitive phrases (such as travel phrases, like the one you indicated).

However, I don't think it's limited to placenames, as the more I experiment, the more I see SERP differences for "popularkeyword1 in popularkeyword2" vs. "popularkeyword1 of popularkeyword2" (once again, without the quotation marks).

So here's what I now think. Proximity is still an on-page factor, and carries weight -- but the so-called stopwords *are* being indexed, and they carry a small weight too, which comes into play when the margins between results are tight (as in "money" searches).
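As a rough illustration of that guess, here is a toy scoring sketch in which stop words are matched but carry only a small weight. Everything in it (the weights, the `score` and `rank` helpers, the two sample pages) is made up for illustration and says nothing about how Google actually scores pages.

```python
# Assumed weights: a keyword match counts 1.0, a stop-word match only 0.05.
KEYWORD_WEIGHT = 1.0
STOP_WORD_WEIGHT = 0.05
STOP_WORDS = {"in", "of"}

# Two hypothetical pages that match the keywords equally well.
PAGES = {
    "page_a": "hotels in paris",
    "page_b": "hotels of paris",
}


def score(query, text):
    """Sum a weight for every query word that appears in the page text."""
    page_words = set(text.lower().split())
    total = 0.0
    for word in query.lower().split():
        if word in page_words:
            total += STOP_WORD_WEIGHT if word in STOP_WORDS else KEYWORD_WEIGHT
    return total


def rank(query):
    """Return page names ordered from highest score to lowest."""
    return sorted(PAGES, key=lambda name: score(query, PAGES[name]), reverse=True)


print(rank("hotels in paris"))
print(rank("hotels of paris"))
```

When the keyword scores are tied, the small stop-word weight is enough to swap the order, which would look like a reshuffle rather than a dramatically different result set.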

By the way, I notice that Brett has described this thread on the homepage as follows:

Google Indexing Stop Words: This goes back to changes made when Google introduced stemming. Google is now indexing stop words - even though the message says it does not.

Brett, do you know something that we don't? Was the introduction of stemming (which was around the time of Florida, which would fit with what I've noticed) somehow related to the introduction of indexed stopwords?

kaled

1:51 pm on Jun 29, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Given that it is the intention of Google (et al) to determine meaning from search phrases, minor words like of, in, etc. should indeed be used. However, they are likely to be used in a grammatical way rather than as indexed words.

Google should explain this rather than stating that they have been ignored. I would suggest the following message:

"The following common words are used for grammar and context analysis only"

Kaled.
