Update Brandy Part 3

         

GoogleGuy

7:41 pm on Feb 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Continued From: [webmasterworld.com...]

"Any clue as to the possible role greater reliance on semantics is playing in your never ending quest for more relevant results?"

I'd say that's inevitable over time. The goal of a good search engine should be both to understand what a document is really about, and to understand (from a very short query) what a user really wants. And then match those things as well as possible. :) Better semantic understanding helps with both those prerequisites and makes the matching easier.

So a good example is stemming. Stemming is basically SEO-neutral, because spammers can create doorway pages with word variants almost as easily as they can to optimize for a single phrase (maybe it's a bit harder to fake realistic doorways now, come to think of it). But webmasters who never think about search engines don't bother to include word variants--they just write whatever natural text they would normally write. Stemming allows us to pull in more good documents that are near-matches. The example I like is [cert advisory]. We can give more weight to www.cert.org/advisories/ because the page has both "advisory" and "advisories" on the page, and "advisories" in the url. Standard stemming isn't necessarily a win for quality, so we took a while and found a way to do it better.
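
To make the [cert advisory] example concrete, here is a minimal sketch of how stemming lets a query match near-variants. The `naive_stem` and `matches` functions are hypothetical names invented for this illustration, and the suffix rules are deliberately crude; this is not Google's stemming, which the post only says was done "better" than standard stemming.

```python
def naive_stem(word: str) -> str:
    """Crude suffix stripping; an assumed stand-in for a real stemmer."""
    word = word.lower()
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"  # advisories -> advisory
    if word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        return word[:-1]  # notes -> note
    return word


def matches(query: str, document: str) -> bool:
    """True if every stemmed query word also appears, stemmed, in the document."""
    query_stems = {naive_stem(w) for w in query.split()}
    doc_stems = {naive_stem(w.strip('.,;:!?"()')) for w in document.split()}
    return query_stems <= doc_stems


# A page that only says "advisories" still matches the query "cert advisory".
print(matches("cert advisory", "CERT advisories and incident notes"))
```

Without the stemming step, the same page would be missed, because "advisory" never appears literally in its text.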

So yes, I think semantics and document/query understanding will be more important in the future. pavlin, I hope that partly answers the second of the two questions that you posted way up near the start of this thread. If not, please ask it again in case I didn't understand it correctly the first time. :)

makemetop

11:55 am on Feb 16, 2004 (gmt 0)



>I bet the two keyword searches are more relevant to buying or renting.

Want to bet? I happen to do a lot of stuff in the car hire area and can show you loads of destinations where the top 10 sites have nothing to do with renting or buying!

sovidiu

11:56 am on Feb 16, 2004 (gmt 0)

10+ Year Member



Teshka said:
"Does this mean that Google's vision of the future Internet is as a massive online encyclopaedia with little commercial content?"

I don't really think so. That would be DMOZ, not Google. As far as we've seen on Google Romania (which accidentally has some inversed links that sort of ruin Google's credibility in our country), they did not update the SERPs as they previously did each Friday. Nor did the PR change, as it used to each Wednesday. And as far as we can see, Google only displays results in the first three SERPs. If you quickly browse the first three SERPs, you'll see Google doing another search for your term. Results on Google are very easy to handle when backed up by a traffic legion. And what's with that error we're getting? (E.g., link a web page to a web site that has PR 6, for example, and is located on the same web server you're on. Sooner or later, you'll get its PR on your web page.) Furthermore, how come Google gives carte blanche to randomly generated forum web pages? For instance, we freely receive a PR 1 on each randomly generated web page, while webmasterworld gets 3. Is that some sort of "trust vote"?
Thanks.

BeeDeeDubbleU

12:05 pm on Feb 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



You mean that you don't see commercial results in DMOZ?

djgreg

12:07 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



It does not explain what happened to my site, that contains the keyword in the url, in the site's name and on all of the backward links.

Well Pavlin, maybe you have gone too far by trading links with the same anchortext every time?

In the area my business is in, the 64 results are very very good, and in every search I needed to do on 64 not regarding my business, I also found relevant information in the TOP 10 results.

So for me the 64 index is absolutely great!

Hissingsid

12:13 pm on Feb 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

I went out for a very nice meal last night, started late this morning and find that GoogleGuy has confirmed all the stuff we have been belly aching about for three months.

SteveB has it exactly right IMHO re "quality signals". But what does that mean? Well, if you boil it down to its pure essence, it means that Google can understand what the page, and to some extent the site, is about if you blank out the term searched for. Just try it on your pages. Print out the source and take a thick felt tip. Score out all of the HTML, then score out all of the words that are in your top term. Do you still know what it's about? How does it compare with the top three in SERPs for that term when you do the same to their pages?
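
The felt-tip test can also be approximated in code. The following is a minimal sketch, assuming you already have the page's HTML as a string; `blank_out` is a hypothetical helper name, and counting the leftover words is just one way to look at what remains, not anything Google has confirmed doing.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the tags ("score out all of the HTML")."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Note: text inside <script> or <style> would also land here.
        self.chunks.append(data)


def blank_out(html: str, search_term: str) -> Counter:
    """Count the words left on a page once the search term's words are removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    term_words = set(search_term.lower().split())
    words = re.findall(r"[a-z0-9']+", text)
    return Counter(w for w in words if w not in term_words)


page = "<html><body><h1>Blue widgets</h1><p>Cheap blue widgets and gadget repair.</p></body></html>"
print(blank_out(page, "blue widgets").most_common())
```

Running the same function over the top three pages for the term gives you leftover vocabularies to compare side by side with your own.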

City searches are particularly difficult here because very often there are no synonyms or stems for the city name. You need to look for what the top sites have as triggers. Build those things into your page, have links to pages on those terms using that term in the anchor text. Google doesn't and can't assess quality subjectively although quality is a subjective measure. It therefore measures objective things that approximate to a subjective assessment.

Everyone here who is interested in this stuff should go and read the thread started by Marin about Latent Semantic Indexing. Read the paper that he cites and try and find the white paper on CIRCA. The penny will drop.

I'm certain that Steveb has not implemented this in a contrived way; his site is just so full of large pages of rich language around his subject. He has achieved high ranking by doing what comes naturally to him. For those of us who need to make a change to break old habits and give the Google algo what it is looking for, there are ways to do so. It's metaphorically like following a diet: you just need to learn the basics of what to do and stick to it.

If you want to find what Google has in its Ontology (if you don't know what one is, do this search: define:ontology) then do a search like this: ~widgets -widgets and note the words that are bold in the results (if you have prefs set to 100 you can quickly scan the results). Then feed these words back in to create a map of associated words. Search for the term and look at what the top three pages use in terms of associated terms and where they use them. Now use this new vocabulary you have learned to broaden the language in your pages and in your site. Pretty soon we'll all be doing what Steveb does naturally.
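
If you have saved the result snippets from a ~widgets -widgets search, tallying the bold words is easy to script. This is a minimal sketch under a few assumptions: the snippets are plain strings you collected yourself, bold terms are wrapped in `<b>` tags, and `bolded_terms` is a hypothetical helper name. It does not query Google.

```python
import re
from collections import Counter

# Matches the contents of <b>...</b>; assumes simple, non-nested bold tags.
BOLD = re.compile(r"<b>(.*?)</b>", re.IGNORECASE | re.DOTALL)


def bolded_terms(snippets, exclude=()):
    """Count how often each bolded term appears across result snippets."""
    excluded = {w.lower() for w in exclude}
    counts = Counter()
    for snippet in snippets:
        for match in BOLD.findall(snippet):
            term = match.strip().lower()
            if term and term not in excluded:
                counts[term] += 1
    return counts


# Made-up snippets for illustration only.
snippets = [
    "Buy <b>gadgets</b> and <b>Gizmos</b> online",
    "Best <b>Gadgets</b> here",
]
print(bolded_terms(snippets).most_common())
```

Feeding each counted term back in as the next ~term -term search, and repeating, builds up the map of associated words by hand.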

Best wishes

Phil

PS The roast partridge was excellent

AthlonInside

12:13 pm on Feb 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I can't see even slight changes to the 64 SERPs. Anyone? Any flux over the last few days?

pavlin

12:25 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



Hissingsid, you are absolutely right. But this set of rules can help only in some industries/themes.
I guess you have read the LSI paper, and it's clear that this way of handling sites works only when the search engine has a set of semantically connected words. I still think it's a dangerous AI game.
Anyway, now I'm going to make my sites "preferred" - get rid of some of the content and add some booble link pages. I have a directory page with no content - just links - that is performing great, so for me that's the way to go.

SlyOldDog

12:31 pm on Feb 16, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks GoogleGuy for being so forthcoming. You know we watch your words as carefully as Alan Greenspan's ;)

I just have one question which nobody has ever adequately answered for me. It seems to be generally accepted practice among SEOs to have a links page and perform link swaps. These are necessary to rank well in competitive areas especially for commercial sites that don't receive many natural links, and where everybody who is anybody has a (user indifferent) links page. Probably on topic - but about as targeted as a double barrel shotgun.

Of course the whole concept is ridiculous, and users would never think of reading most of the links pages on these sites. The Google Guidelines even prohibit artificial linking to deceive the search engine algorithm.

On the other hand, for many competitive areas, no links page means no good ranking, and so far Google has tolerated sites which have these pages. So what is your take on this activity?

pavlin

1:25 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



Hm,
There is another important question - What happens to the pages, that SHOULD NOT be handled by the new semantic algo. I think this is the key to MIA sites and this is the reason for all of the OOP rumours.

So if you have a page that is non-English but uses an English word as a keyword in the URL and the site's name, or is so closely tied to one topic, with the LSI you are in trouble.
It seems the new algo is really based on those dictionaries of close words, and G is expecting that if your site is dedicated to topic "kw1" it has to say something about "kw2", "kw3", "kw4" and so on. If not, this is SPAM, the algo assumes.
So if you have done good optimisation for kw1, but do not use the rest of the kw's, you end up "penalised".
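
As a rough way to check your own page against this theory, you could measure how many of the related keywords it actually mentions. This is a minimal sketch of that guess only: `related_term_coverage` is a hypothetical name, the related terms would come from your own "~kw1" testing, and nothing here is a confirmed part of Google's algo.

```python
import re


def related_term_coverage(page_text, related_terms):
    """Fraction of the expected related terms that appear on the page."""
    # \w+ is Unicode-aware in Python 3, so Cyrillic words are tokenised too.
    words = set(re.findall(r"\w+", page_text.lower()))
    related = [t.lower() for t in related_terms]
    if not related:
        return 0.0
    found = sum(1 for t in related if t in words)
    return found / len(related)


# A page optimised for kw1 that mentions only two of four related terms.
print(related_term_coverage("kw1 kw2 text about kw3", ["kw2", "kw3", "kw4", "kw5"]))
```

A page written in another language or alphabet would score near zero against an English term list even when it covers the same ideas.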

The problem is G is using this algo everywhere, even if it knows that the pages are non-English.
I guess that's the problem with my MIA page: it is in a non-English language, but uses an English word in its title and as a main kw. But when G sees this kw (kw1) it expects to see the other kw's in its dictionary. And when they do not show up, it thinks the site is spam. The truth is that the rest of the kw's are there (I did some "~kw1" testing and know what my synonyms are), but are written in another language and even in another alphabet (cyr). (It will be a pain for the users if I go and use all of the English words.)

I guess that's the same with the sites that are so closely focused on some subject that they do not include the other words. And that's why there are so many portal sites on top - their directory listings contain links with descriptions that use almost every kw that G expects to see.

So the question I asked until now is wrong. It's now what is happening to the sites that the algo fails to understand. The answer is - they are getting handled the old way (pre-Austin).

! -> So it would be nice if G stopped applying the semantic algo to the sites it knows are non-English!

As for the English sites - Hissingsid is absolutely right - do some "~KW1" testing and try to use as many of the other kw's that come up as you can.
Also - make those kw's links to some highly relevant sites.

It's not what people think is relevant any more. It's what the machine (in this case G) thinks is relevant. Obey and God help us all!

[edited by: pavlin at 1:55 pm (utc) on Feb. 16, 2004]

itwasntme

1:37 pm on Feb 16, 2004 (gmt 0)



Sorry for interrupting the discussion, but there's still lots of spam in the SERPs. Is it ok to file a spam-report and mention "brandyupdate GoogleGuy mynick"? I just tried it and I'm curious if it works.

tambiz

1:53 pm on Feb 16, 2004 (gmt 0)



I'm seeing 64 results on google dot com this morning.

Does that mean that it's over? If it is, my site is toast!

customdy

1:57 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



still seeing 216 on www in the US (East Coast). Have not seen 64 on www yet...

sloney

2:02 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



I saw 64 on co.uk about 2 hours ago for about 3 minutes (it was only on one or two datacentres and I had to refresh a few times) and then it was gone! Aol.co.uk is still showing 64 - results are good here. Thanks everyone for keeping us up to date with this.

Ledfish

2:07 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



customdy

Not seeing here in the Great Lakes area (Ohio) either. 216 and 64 are not even close so I figure it must not be done cooking yet. At least I hope since the results are so dramatically different.

Kennyh

2:07 pm on Feb 16, 2004 (gmt 0)

10+ Year Member



sloney - are you sure aol.co.uk is showing 64 results? They look identical to 216 (and different to 64) for the keywords I'm checking.