Forum Moderators: Robert Charlton & goodroi
Interesting that you point this out; I've experienced similar frustration of late.
I wonder, probably naively, if users are now sufficiently discerning/sophisticated in their use of common operators when searching that Google feels it worthwhile or necessary to return results accordingly...
Syzygy
I'm with Borat!
(I know adding the question mark doesn't make it a question, but it has made you Borat for a while.)
In general, I've found that Google likes exact matches on the page... just not too many of them... but there are several hundred other variables. I can imagine various off-page/on-page scenarios that might cause a page containing the three words separated on the page to rank higher... probably less likely to happen as the three-word phrase becomes more competitive and more purposefully targeted by others.
Please paint a fuller picture of what you have in mind, and we can work on the title as the question becomes clearer. ;)
[edited by: Robert_Charlton at 1:14 am (utc) on July 31, 2007]
About a year or so ago (hmm, what happened then?), you could put in "fussy yellow blue widgets" and come up with the 4-6 pages directly addressing the subject.
Of course, they were low PR, since few people actually link to those types of pages.
Nowadays you find those pages in the supplemental or only with quotes.
--- and you dare not type "widgets yellow fussy blue" and expect to find it.
Now, higher PR or Trusted pages get credit for having those 3-4 words anywhere on the page.
Assuming Goog did this on purpose, one can only guess that most users still don't search with more than two words, or that Google doesn't mind users clicking on the AdWords ads to find a similar result rather than digging to page 60.
I suppose a low "exact phrase" match could be a counter-spam measure; I did see a recent comment by Matt C that they'd lifted it when a topical phrase brought bad results. It just seems it's normally set a bit too low.
This is more true on Google than on the other engines. (You might say that MSN, eg, still doesn't get the Apathy Club joke).
The longer the phrase, I feel, the more unnatural it is for it to occur very many times as an exact match on the same page. And some of us still don't like to use a keyword more than once in a title. But I have found that an occasional exact match on the page can be very helpful.
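To make the "occasional exact match" idea concrete, here's a toy Python sketch (purely illustrative, not anything Google has published) that counts how many times an exact phrase occurs on a page. The sample text is made up; the point is just that a long phrase repeated many times as an exact match looks unnatural, while one or two occurrences don't:

```python
import re

def exact_phrase_count(page_text: str, phrase: str) -> int:
    """Count exact (case-insensitive) occurrences of a phrase in page text."""
    pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
    return len(re.findall(pattern, page_text.lower()))

# Hypothetical page copy for illustration
text = "Fussy yellow blue widgets are rare. We sell fussy yellow blue widgets."
print(exact_phrase_count(text, "fussy yellow blue widgets"))  # → 2
```

A counter like this could feed a threshold check (too many exact repeats of a long phrase on one page reads as over-optimization), but where any real engine sets that threshold is anyone's guess.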
I'll try to un-Boratize the title. Please let me know if I get it right.
Many thanks to Bill Slawski for his insightful commentary on this Google patent [seobythesea.com]. His article is worthwhile reading for those who are following this kind of thing.
There are other related Google technologies in this area. For example, see our earlier thread about the six phrase based indexing patents [webmasterworld.com]. But one key take-away is to recognize that all search "phrases" are not created equal. Google continues to work at understanding when groups of query words make up a true semantic unit, and when they are just multiple query words.
From the patent's "Description" section:
Assume that a user enters the search terms "baldur's gate download." The user intends for this query to return web pages that are relevant to the user's intention of downloading the computer game called "baldur's gate." Although "baldur's gate" includes two words, the two words together form a single semantically meaningful unit. If the search engine is able to recognize "baldur's gate" as a single semantic unit, called a compound herein, the search engine is more likely to return the web pages desired by the user.
Now this patent was applied for seven years ago, and there's no reason to assume it's in use today, exactly as described. But it can help us appreciate what Google considers important, and at least one methodology they have considered seriously enough to apply for a patent.
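One common way to detect that two query words form a compound like "baldur's gate" is to compare how often they appear together versus apart in a corpus, e.g. via pointwise mutual information (PMI). This is a standard technique, not necessarily what the patent describes; the corpus below is a tiny hypothetical stand-in for real query or document data:

```python
import math
from collections import Counter

# Toy corpus standing in for real query/document data (hypothetical)
corpus = [
    "baldur's gate download",
    "baldur's gate walkthrough",
    "download free games",
    "gate repair service",
    "baldur's gate review",
]

unigrams = Counter()
bigrams = Counter()
total_bigrams = 0
for doc in corpus:
    words = doc.split()
    unigrams.update(words)
    for a, b in zip(words, words[1:]):
        bigrams[(a, b)] += 1
        total_bigrams += 1
total_words = sum(unigrams.values())

def pmi(a: str, b: str) -> float:
    """Pointwise mutual information of the bigram (a, b):
    high values mean the words co-occur far more than chance predicts."""
    p_ab = bigrams[(a, b)] / total_bigrams
    p_a = unigrams[a] / total_words
    p_b = unigrams[b] / total_words
    return math.log2(p_ab / (p_a * p_b))

print(pmi("baldur's", "gate"))  # strongly positive: a likely compound
```

A real system would work from massive query logs and add smoothing, frequency cutoffs, and longer n-grams, but the underlying signal is the same: words that travel together are probably a single semantic unit.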
One of my pages relates to a perfectly valid search term FOO BAR ROO (say). Once upon a time it was #1 in the serps for a search on FOO BAR ROO, and it used to get traffic.
However, the page is now supplemental. The same search now shows a Wikipedia page on that subject as #1, followed by a mishmash of unrelated pages with the keywords on the page, and then my page well down in the serps.
But searching in quotes "FOO BAR ROO" brings up my page as #1.
As the page is technically no different from my other pages (links, format, likely PR, etc.), I suspect that one of the reasons it is supplemental is that Google does not recognize FOO BAR ROO as a semantic term, and sees it as a poor performer compared to other FOO or BAR or ROO pages.
It seems therefore that this page is probably never going to get out of supplemental status. And as only a very small proportion of searchers use (or even know about) sophisticated search methods, the page will hardly ever get any traffic.
Across all my sites about 50% of pages are supplemental. So if 'semantic unit' technology is one of the reasons, it gives me a major headache.
Consider that Google also offers its translation tool. I haven't used it very much, but I think it gives quite reasonable results, which suggests Google's software is capable of identifying far deeper structures:
"Syntactic structures" (Chomsky, 1957) are so fundamental to (computer-) grammars, and offer a lot of analytical potentials: I'd suspect many of the examples of OP's type are easily explained by e.g. some of the basic rules or constraints in transformational grammar [en.wikipedia.org]. However, the devil lies in the details here, and it is hard to discuss such assumptions without violating the TOS.
Yet I have nowhere found any hints that Google relies on any such syntactic analysis in its ranking algos, though I'd really be surprised if it didn't.