"Any clue as to the possible role greater reliance on semantics is playing in your never ending quest for more relevant results?"
I'd say that's inevitable over time. The goal of a good search engine should be both to understand what a document is really about, and to understand (from a very short query) what a user really wants. And then match those things as well as possible. :) Better semantic understanding helps with both those prerequisites and makes the matching easier.
So a good example is stemming. Stemming is basically SEO-neutral, because spammers can create doorway pages with word variants almost as easily as they can optimize for a single phrase (maybe it's a bit harder to fake realistic doorways now, come to think of it). But webmasters who never think about search engines don't bother to include word variants--they just write whatever natural text they would normally write. Stemming allows us to pull in more good documents that are near-matches. The example I like is [cert advisory]. We can give more weight to www.cert.org/advisories/ because the page contains both "advisory" and "advisories", and has "advisories" in the URL. Standard stemming isn't necessarily a win for quality, so we took a while and found a way to do it better.
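To make the near-match idea concrete, here is a minimal sketch of stem-based matching in Python. The `naive_stem` function and its two suffix rules are hypothetical and deliberately crude; this is plain "standard" suffix stripping for illustration, not the improved approach the post alludes to, which is not described anywhere in this thread.

```python
def naive_stem(word: str) -> str:
    """Crude suffix stripping, for illustration only (hypothetical rules)."""
    w = word.lower()
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"   # advisories -> advisory
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]         # widgets -> widget
    return w


def stem_overlap(query: str, text: str) -> set:
    """Return the query stems that also appear among the text's stems."""
    query_stems = {naive_stem(t) for t in query.split()}
    text_stems = {naive_stem(t) for t in text.split()}
    return query_stems & text_stems


# Made-up page text: it only says "advisories", never "advisory".
page_text = "CERT advisories archive"
matched = stem_overlap("cert advisory", page_text)
```

With stemming, the query term "advisory" still matches a page that only uses the plural, which is the kind of near-match the post is talking about.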
So yes, I think semantics and document/query understanding will be more important in the future. pavlin, I hope that partly answers the second of the two questions that you posted way up near the start of this thread. If not, please ask it again in case I didn't understand it correctly the first time. :)
Chicago, from looking at just one site as an example, phrase one - which is far more competitive - has many pages mentioning it prominently and is doing fine. Phrase two, however, is barely mentioned outside of one or two pages and is doing poorly. The optimization is just about the same; the difference is in the amount of text on the pages, and therefore the KWD, and in the number of pages on the site for one phrase or the other.
Inktomi loves the interior page about phrase two and likes the interior page for phrase one but does not care much for phrase one at all for the homepage - exactly the opposite of Google. Ink is looking at on-page factors, and I truly believe (gut level, nothing empirically provable) that if there were more pages on that site for phrase two it would do better with Google.
Both those phrases are the same for the second word - it's the first word that's the modifier. It's the second word that's really the important one.
I can grasp the concept with individual words, but I've been trying to wrap my head around the concept of IDF when it comes to phrases.
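One way to picture phrase IDF, purely as a sketch: treat the whole phrase as a single unit and count the documents that contain it as a contiguous run of words, then apply the usual log(N / df). The `idf` helper and the four-document toy corpus below are made up for illustration; nothing here is claimed to be how any search engine actually weights phrases.

```python
import math


def idf(unit: str, docs: list) -> float:
    """log(N / df), where df counts docs containing `unit` as a whole-word run."""
    needle = " " + " ".join(unit.lower().split()) + " "
    df = sum(
        1 for doc in docs
        if needle in " " + " ".join(doc.lower().split()) + " "
    )
    if df == 0:
        return float("inf")  # unit never occurs; no finite IDF in this toy version
    return math.log(len(docs) / df)


# Hypothetical toy corpus
docs = [
    "blue widget store",
    "red widget store",
    "blue sky photos",
    "widget repair guide",
]

idf_word_common = idf("widget", docs)    # appears in 3 of 4 docs
idf_word_modifier = idf("blue", docs)    # appears in 2 of 4 docs
idf_phrase = idf("blue widget", docs)    # appears in 1 of 4 docs
```

Under this view a phrase can never be more common than its rarest word, so its IDF is at least as high as that of either word on its own.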
[edited by: Marcia at 5:30 am (utc) on Feb. 17, 2004]
Google SERPS just went completely berserk! This cannot be right... what the heck is going on? Within the last 10 minutes results have turned completely on their head!
You beat me to it, 'mytown widget' search was showing me very nicely placed. Now I am unable to locate my site at all.
[added]Can now find a deep listing for my site for this search on page 13, main index page is not in the SERPS at all and the keywords are very relevant to that page[/added]
Absolutely agree. This is just impossible to describe if you aren't seeing it. Sites being totally lost one second, ranking first the next, dropping forty spots, rising to second, dropping to 400th... I kid you not.
This would be on www, www2 and www3... 64 is steady.
Serious Observation:
It seems as if inbound links from pages of the same domain count much more than before. I have a website with a menu, shown on every page, that points to some subdomains. One of the menu entries is a quite competitive search term, but the site it points to was never optimised in any way: the keyword is not in the title and appears only once on the page. Now this subdomain ranks pretty high for the anchor text used in the menu, although it has zero links from other domains. The page is pretty relevant for the keyword, so that is not the problem; I was just wondering how come this site beats all those highly optimized sites for this keyword.
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.165 - - [15/Feb/2004:14:48:39 -0700] "GET /robots.txt HTTP/1.0" 200 25 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.165 - - [15/Feb/2004:14:48:39 -0700] "GET /bottom_widgets.html HTTP/1.0" 200 13148 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.159 - - [15/Feb/2004:15:15:28 -0700] "GET /mmobile_widgets.html HTTP/1.0" 200 12431 "-"
then
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.79 - - [15/Feb/2004:15:25:28 -0700] "GET / HTTP/1.0" 200 11178 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.33 - - [15/Feb/2004:16:09:54 -0700] "GET /about_widgets.html HTTP/1.0" 200 15562 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.135 - - [15/Feb/2004:16:26:34 -0700] "GET /stringed_widgets.html HTTP/1.0" 200 12809 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.136 - - [15/Feb/2004:16:29:25 -0700] "GET /robots.txt HTTP/1.0" 200 25 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.136 - - [15/Feb/2004:16:29:25 -0700] "GET /contact_widgets.html HTTP/1.0" 200 9892 "-"
then
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.46 - - [15/Feb/2004:18:11:42 -0700] "GET /top_widgets.html HTTP/1.0" 200 13044 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.44 - - [15/Feb/2004:18:11:46 -0700] "GET /widget_accessories.html HTTP/1.0" 200 8905 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.137 - - [15/Feb/2004:18:53:56 -0700] "GET /differentwidget.httml HTTP/1.0" 200 23414 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.135 - - [15/Feb/2004:19:26:46 -0700] "GET /attaching_widgets.html HTTP/1.0" 200 11341 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.137 - - [15/Feb/2004:19:54:34 -0700] "GET /connecting_widgets.html HTTP/1.0" 200 1330 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.165 - - [15/Feb/2004:20:13:04 -0700] "GET /widget2.html HTTP/1.0" 200 10785 "-"
then
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.208 - - [15/Feb/2004:20:48:18 -0700] "GET /widget7.html HTTP/1.0" 200 11649 "-"
then
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.33 - - [15/Feb/2004:20:48:24 -0700] "GET /widget4.html HTTP/1.0" 200 9249 "-"
then
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.164 - - [15/Feb/2004:21:26:46 -0700] "GET /single_widget.html HTTP/1.0" 200 23376 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.164 - - [15/Feb/2004:21:26:47 -0700] "GET /widget_convertible.html HTTP/1.0" 200 15877 "-"
"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.79 - - [15/Feb/2004:22:37:47 -0700] "GET /widget_Moving_brochure.pdf HTTP/1.0" 200 60358 "-"
then
I double-checked the IP addresses; they're all genuine.
Is this part of the Brandy update, with multiple 64.****.xx.xx based Gbots visiting and revisiting?
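For anyone who wants to tally visits like these rather than eyeball them, here is a rough Python sketch that counts Googlebot requests per IP. It assumes the log layout seen in the lines pasted above (user agent first, then IP, timestamp, request, status, size); the `LOG_LINE` pattern and `summarize` function are hypothetical names, and other servers will likely need a different pattern. It only counts hits; it does not check whether an IP really belongs to Google.

```python
import re
from collections import Counter

# Pattern assumed from the pasted lines: "agent" ip - - [time] "request" status size
LOG_LINE = re.compile(
    r'^"(?P<agent>[^"]*)" (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)


def summarize(lines):
    """Count Googlebot requests per IP and collect the requested paths."""
    hits_per_ip = Counter()
    paths = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None or not m.group("agent").startswith("Googlebot"):
            continue  # skip unparseable lines and other user agents
        hits_per_ip[m.group("ip")] += 1
        paths.append(m.group("path"))
    return hits_per_ip, paths


# Three of the lines from the log excerpt above
sample = [
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.165 - - [15/Feb/2004:14:48:39 -0700] "GET /robots.txt HTTP/1.0" 200 25 "-"',
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.165 - - [15/Feb/2004:14:48:39 -0700] "GET /bottom_widgets.html HTTP/1.0" 200 13148 "-"',
    '"Googlebot/2.1 (+http://www.googlebot.com/bot.html)" 64.68.82.136 - - [15/Feb/2004:16:29:25 -0700] "GET /robots.txt HTTP/1.0" 200 25 "-"',
]

hits_per_ip, paths = summarize(sample)
```

Running this over a full day's log gives a quick picture of how many distinct crawler IPs came by and how often each one re-fetched robots.txt.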
[edited by: a_chameleon at 7:42 am (utc) on Feb. 17, 2004]