My search for Ford Dealership {another Keyword} {another Keyword} returned every variation of every word, including Ford's, Fords, Dealers, Dealer, and on and on - each highlighted as though I had searched for it.
Is this new or have I just slept through something?
-s-
This lends itself to some interesting new opportunities, but IMHO can complicate things a little bit for those among us who are intent on optimizing pages for Inktomi.
With the old Google, when you searched for kw1 kw2 you found exactly what you were looking for. Now, this same search will return results about kw2 that have very little to do with kw1. It amazes me how Google claims they have improved their search engine, when in reality they made it much worse!
rfg, it's complicated because with Ink there's a barebones simplicity to making pages that are very precisely targeted. With Google now, you can be found by people searching on the alternate form of a word - in *some* cases they find what they're looking for whichever form they use.
If some pages are doing well with Ink, trying to vary them so they get found both ways with Google could mess up the balance just a little bit for the Ink listings. For some pages, anyway. It's a tomayto, tomahto thing. Some say it one way, some another but they're the same thing.
I've been getting Ink at Yahoo for several weeks and was looking at this very thing all afternoon, with two different forms of a certain item descriptor for the second word in two-word phrases and several different modifying words for the first word.
An interesting aspect is the message we get that words are only found in "pages linking to this page," and seeing what will happen with different combinations of on-page and inbound link usage of the alternate words and phrases in varying proportions.
>>you can turn it off/do an exact search with + in front of the word, or by putting it in quotes.
There are some tremendous differences in numbers of pages returned, using the quotes or the +. For a two-word phrase, using the + in front of each of the two words pulls in some very interesting differences.
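To make the difference between the three query forms concrete, here is a rough sketch in Python. It is only a toy: the suffix list, the `toy_stem` and `search` helpers, and the three sample pages are all made up for illustration, and nothing here is claimed to be how Google actually stems or ranks. It just shows why a plain search can match more pages than `+word1 +word2`, and why the quoted phrase matches fewest.

```python
import re

# Hypothetical suffix list, chosen only so the examples in this thread collapse
# to a common root. A real stemmer is far more careful than this.
SUFFIXES = ("'s", "ship", "ers", "er", "s")


def toy_stem(word):
    """Strip suffixes repeatedly until nothing more comes off (illustration only)."""
    word = word.lower()
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            # Keep at least three characters so short words are left alone.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[: -len(suffix)]
                changed = True
                break
    return word


def tokenize(text):
    """Lowercase the text and split it into words (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def contains_phrase(tokens, words):
    """True if `words` occurs as a consecutive run inside `tokens`."""
    n = len(words)
    return n > 0 and any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def search(query, pages):
    """Return sorted page names matching the query.

    "quoted phrase" -> exact consecutive words
    +word           -> that exact word must be on the page
    word            -> any word sharing the same toy stem will do
    """
    query = query.strip()
    if len(query) >= 2 and query[0] == '"' and query[-1] == '"':
        words = tokenize(query[1:-1])
        return sorted(n for n, t in pages.items() if contains_phrase(tokenize(t), words))
    hits = []
    for name, text in pages.items():
        tokens = tokenize(text)
        stems = {toy_stem(t) for t in tokens}
        ok = True
        for term in query.split():
            if term.startswith("+"):
                ok = ok and term[1:].lower() in tokens
            else:
                ok = ok and toy_stem(term) in stems
        if ok:
            hits.append(name)
    return sorted(hits)


# Made-up pages, not real search results.
PAGES = {
    "a": "Ford dealership in Ohio",
    "b": "Local Ford dealers and Ford's service",
    "c": "Dealership for Ford trucks",
}

print(search("ford dealership", PAGES))    # stemmed match
print(search("+ford +dealership", PAGES))  # exact words, any order
print(search('"ford dealership"', PAGES))  # exact phrase
```

With these sample pages the plain search matches all three, the `+` form drops the page that only says "dealers", and the quoted form keeps only the page with the words side by side. Real result counts obviously depend on much more than this, anchor text included.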
I'm looking at one in particular. Normal "raw" search returns 4,590,000 pages - and the page in the #1 spot (not mine, btw) does not have that phrase on the page at all, nor in the site itself that I've seen because that "stuff" is not on that site at all, it's just a fancy doorway page.
Same search using the 2-word phrase in quotes returns 348,000 pages - with what's the #2 page in the first example coming up at #1 and not a sign of the page that was #1 in the first example.
Same search using +word1 +word2 brings back 4,620,000 pages, with NO sign of the site that was #1 in the first, "raw" search above. Again, the site that was at #2 is #1 for this one, too.
In that search, it's word-2 that's the one that could be the variable, with either of two forms of the word. The site that's #2 and moved to #1 cannot rank for form #2 because the alternate form is not used, particularly not in anchor text, not even once.
For that stuff using the variable for the second word, the site (without the stuff) also comes up at #1 but whoooaaa!
Looking at the cache for that #1 site for both:
These terms only appear in links pointing to this page:
And for the #2 - indented result for the same site, same search
These terms only appear in links pointing to this page:
There are close to 4 million pages returned for the second variation - and in neither case is it because of what's on the page or the site. It's been all anchor text all along and still is. For a while it was only the second form; it seems anchor text was added later to pull in the traffic from the first search mentioned as well.
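For anyone who wants to picture the "only appear in links pointing to this page" situation, here is a small hedged sketch in Python. The `where_terms_match` function and the sample text are hypothetical, and this is not a claim about how Google builds its index; it only shows the bookkeeping of a query term being credited to a page through inbound anchor text when the term is nowhere in the page itself.

```python
def where_terms_match(query, page_text, inbound_anchors):
    """Classify each query term for one page (toy model, exact words only).

    only_in_links: term is absent from the page text but present in the
                   anchor text of at least one inbound link
    missing:       term is in neither place
    """
    terms = query.lower().split()
    on_page = set(page_text.lower().split())
    in_links = set()
    for anchor in inbound_anchors:
        in_links.update(anchor.lower().split())
    only_in_links = [t for t in terms if t not in on_page and t in in_links]
    missing = [t for t in terms if t not in on_page and t not in in_links]
    return {"only_in_links": only_in_links, "missing": missing}


# Made-up doorway page: neither query word is on the page,
# both arrive purely through anchor text.
result = where_terms_match(
    "blue widgets",
    "Welcome to our store",
    ["blue widgets", "cheap blue widget"],
)
print(result)
```

In this toy case both words land in `only_in_links`, which is the same kind of situation the cache notice describes for the #1 page above.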
On the other hand, that page is at the very end of top ten with Yahoo/Ink - by virtue of the description in the Yahoo Directory listing. There's not a snowball's chance in Miami mid-summer that site will hit #1 with Yahoo/Ink on its own. It would take some mighty high-handed finagling of the Directory to accomplish that as it stands now.
[edited by: Marcia at 6:05 am (utc) on Dec. 26, 2003]
IITian, name variants are not really stemming (e.g. coaching <-> coach), but more like synonyms. There may be some conservative amount of that going on right now, but not very much.
By the way, I'm traveling again over the next day or two, so I might not pop in as often for a while..
I'd be very tempted to review the testing methodology.
>according to conventional informational retrieval wisdom, stemming would often be just a wash, but we've done some extra work
A pig with lipstick on is still a pig ;)
On a more constructive note, if you are looking for unbiased feedback I'd be tempted to ask the guys and gals over at Google Answers for their opinion. Spent some time over there today; seems they are having to use some really complex search syntax to earn their dollars.
There are so many words with different spellings in different parts of the world. If I am looking for a word the way it is used in my region of the world, I prefer getting sites using the word exactly as I entered it in Google (as used to be the case).
Bringing this technology to languages other than English will, in some cases, bring even more pain than it does right now!
The advantage of Google is its ease of use. I don't think it is possible to train enough people to use + for their searches in order to find what they are looking for.