Forum Moderators: Robert Charlton & goodroi
Is that fairly new, or have I been asleep at the wheel? I like it a lot and thank the GWT team for this graphical presentation.
I especially take note of this line:
"These should reflect the subject matter of your site."
That one sentence speaks volumes...
...........................
"These should reflect the subject matter of your site."
So all I can say is - given that sentence from Google - if I had totally unrelated words near the top of that list, and knowing my tendency to worry, I'd probably be concerned. That concern may not be justified, but it would still be there lurking in the background.
............................
The flipside is that if you have a "terms and conditions" link in the footer of a site, that isn't likely to be worth removing purely for making a Google report look nicer. The impact on relevance would be so minor as to make the task of removing it a waste of time.
Google even seems to have filtered some of the most common repeated words out of the list, just for this report.
It's still massively bloated HTML that needs to go on a major diet, but I wouldn't expect img, href, alt and rel to be showing up as keywords.
I'll be curious to see whether the situation improves once I clear out some cruft in the template and the site gets recrawled.
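To illustrate the point about bloated HTML, here's a toy sketch (my own illustration, not Google's actual pipeline): if a crawler naively counted tokens in raw markup, attribute names like img, href, alt and rel would dominate a template-heavy page, whereas extracting the visible text first keeps the counts about the actual content.

```python
# Toy keyword counter: extract visible text with the stdlib HTML parser,
# then count word tokens. Attribute names never reach the counter.
from collections import Counter
from html.parser import HTMLParser
import re

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Only text between tags lands here, never tag or attribute names.
        self.chunks.append(data)

def keyword_counts(html):
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(extractor.chunks).lower()
    return Counter(re.findall(r"[a-z]+", text))

page = '<p>Wasp nests</p><img src="wasp.jpg" alt="nest" rel="photo">'
print(keyword_counts(page))  # counts "wasp" and "nests" only

# Counting the raw markup instead would pull in img, src, alt, rel...
print(Counter(re.findall(r"[a-z]+", page.lower())))
```

The comparison at the end shows why a raw-markup count on a bloated template makes HTML attributes look like "keywords" in a report.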
What if Google is trying to classify websites in their entirety? A website that is predominantly about a topic, let's say wasps, is possibly a better answer for queries about wasps. A general site about insects might not have info that's as detailed. It's like going to a general bug expert versus a wasp specialist. A sitewide relevance score would be a relatively clean metric, since it's hard to appear to be a specialist unless all of your content indicates so. Spammy websites that try to cover a lot of topics will find it hard to look like a specialist in any one of them. Could just be one additional metric. Just a thought.
No doubt that is oversimplified and there are lots of other variables, but you get the idea. So if the "potency" is 100%, then it may very well imply more of a "specialist" or "authority". I'd be curious if anyone has any data to support this supposition.
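A minimal sketch of what such a "potency" score could look like (entirely my own toy formula, not a known Google metric): the share of a site's pages whose text mentions a term. A wasp specialist would score near 100% for "wasp", while a general insect site would score much lower.

```python
# Hypothetical "potency": percentage of a site's pages mentioning a term.
def potency(pages, term):
    """pages: list of page-text strings; term: keyword to score."""
    if not pages:
        return 0.0
    hits = sum(1 for text in pages if term in text.lower())
    return 100.0 * hits / len(pages)

specialist = ["wasp nests", "wasp stings", "paper wasp species"]
generalist = ["wasp nests", "ant colonies", "beetle guide", "moth traps"]

print(potency(specialist, "wasp"))  # 100.0 - every page is on-topic
print(potency(generalist, "wasp"))  # 25.0 - only one page in four
```

Under this toy scoring, a site that spreads itself across many topics can't score highly on any single one, which is the "hard to appear to be a specialist" intuition above.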
................
For my most important two-word term (widgets services) there's one site that I would estimate is very strong (sitewide) for widgets but not very strong for services - say 100% widgets, 20% services. This is bobbing about in the #1 - #3 slots. My own site is 100% services, 65% widgets, and is also in the #1 - #3 slots for this term.

I'm starting to wonder if the Google algorithm isn't sure, but is trying to learn which of the two words is most important in the two-word term. I suspect that over time it will realise that widgets is the general part and services is the specific part. It will therefore work out that when people search for this two-word term they are more interested in the specific, but only when it applies to the general. Over time, sites that are wholly focused on the two-word term, or are balanced more towards the specific, will rank higher. Personalised search will tell Google that the specific is a more satisfactory answer than the general.
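The arithmetic behind that speculation can be sketched like this (all numbers and the blending formula are hypothetical, just to make the two-site comparison above concrete): blend each site's per-word strength with a learned weight on the specific word, and watch the ranking flip as that weight grows.

```python
# Toy blend of per-word sitewide strengths for a two-word query.
# specific_weight is the (hypothetical) learned weight on the specific word.
def blended_score(widgets, services, specific_weight):
    return (1 - specific_weight) * widgets + specific_weight * services

site_a = (1.00, 0.20)  # strong on "widgets" (general), weak on "services"
site_b = (0.65, 1.00)  # the poster's site: strong on "services" (specific)

# As the weight shifts toward the specific word, site_b overtakes site_a.
for w in (0.3, 0.5, 0.7):
    print(w, blended_score(*site_a, w), blended_score(*site_b, w))
```

At a low weight on the specific word the widgets-heavy site edges ahead; once the weight passes roughly a third, the services-focused site wins, which matches the "over time the specific will rank higher" prediction.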
Cheers
Sid
Just to add: I also notice that when I click on More>> it says "These should reflect the subject matter of your site." Perhaps this indicates that a semantic topicality metric is used in the new algorithm.
Perhaps this indicates a semantic topicality metric is used in the new algorithm.
I'd agree with that - no matter what that "should reflect" message means. Semantic information has been growing in the algo for a while. The Phrase-Based Indexing patents [webmasterworld.com] laid out a lot of Google's approach.
In addition, there are query-term taxonomies that are also infused with user-intention data. There's quite a macramé of semantics going on, and I'm hoping that the Caffeine infrastructure allows all of it to be recalculated more frequently.