Forum Moderators: Robert Charlton & goodroi
Is that fairly new, or have I been asleep at the wheel? I like it a lot and thank the GWT team for this graphical presentation.
I especially take note of this line:
"These should reflect the subject matter of your site."
That one sentence speaks volumes...
...........................
"These should reflect the subject matter of your site."
So all I can say is - given that sentence from Google - if I had totally unrelated words near the top of that list, and knowing my tendency to worry, I'd probably be concerned. That concern may not be justified, but it would still be there lurking in the background.
............................
The flipside is that if you have a "terms and conditions" link in the footer of a site, that isn't likely to be worth removing purely for making a Google report look nicer. The impact on relevance would be so minor as to make the task of removing it a waste of time.
Google even seems to have filtered some of the most common repeated words out of the list, just for this report.
It's still massively bloated HTML that needs to go on a major diet, but I wouldn't expect img, href, alt and rel to be showing up as keywords.
I'll be curious to see whether the situation improves once I clear out some cruft in the template and the site gets recrawled.
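To illustrate the point about bloated HTML, here's a toy sketch (my own illustration, not Google's actual pipeline): if a crawler naively counted tokens in raw markup, attribute names like img, href, alt and rel would dominate a template-heavy page, whereas extracting the visible text first keeps the counts about the actual content.

```python
# Toy keyword counter: extract visible text with the stdlib HTML parser,
# then count word tokens. Attribute names never reach the counter.
from collections import Counter
from html.parser import HTMLParser
import re

class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Only text between tags lands here, never tag or attribute names.
        self.chunks.append(data)

def keyword_counts(html):
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(extractor.chunks).lower()
    return Counter(re.findall(r"[a-z]+", text))

page = '<p>Wasp nests</p><img src="wasp.jpg" alt="nest" rel="photo">'
print(keyword_counts(page))  # counts "wasp" and "nests" only

# Counting the raw markup instead would pull in img, src, alt, rel...
print(Counter(re.findall(r"[a-z]+", page.lower())))
```

The comparison at the end shows why a raw-markup count on a bloated template makes HTML attributes look like "keywords" in a report.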
What if Google is trying to classify websites in their entirety? A website that is predominantly about a topic, let's say wasps, is possibly a better answer for queries about wasps. A general site about insects might not have info that's as detailed. It's like going to a general bug expert versus a wasp specialist. A sitewide relevance score would be a relatively clean metric, since it's hard to appear to be a specialist unless all of your content indicates so. Spammy websites that try to cover a lot of topics will find it hard to look like a specialist in any one of them. Could just be one additional metric. Just a thought.
No doubt that is oversimplified and there are lots of other variables, but you get the idea. So if the "potency" is 100%, then it may very well imply more of a "specialist" or "authority". I'd be curious if anyone has any data to support this supposition.
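A minimal sketch of what such a "potency" score could look like (entirely my own toy formula, not a known Google metric): the share of a site's pages whose text mentions a term. A wasp specialist would score near 100% for "wasp", while a general insect site would score much lower.

```python
# Hypothetical "potency": percentage of a site's pages mentioning a term.
def potency(pages, term):
    """pages: list of page-text strings; term: keyword to score."""
    if not pages:
        return 0.0
    hits = sum(1 for text in pages if term in text.lower())
    return 100.0 * hits / len(pages)

specialist = ["wasp nests", "wasp stings", "paper wasp species"]
generalist = ["wasp nests", "ant colonies", "beetle guide", "moth traps"]

print(potency(specialist, "wasp"))  # 100.0 - every page is on-topic
print(potency(generalist, "wasp"))  # 25.0 - only one page in four
```

Under this toy scoring, a site that spreads itself across many topics can't score highly on any single one, which is the "hard to appear to be a specialist" intuition above.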
................
For my most important two-word term (widgets services) there's one site that I would estimate is very strong (sitewide) for widgets but not very strong for services - say 100% widgets, 20% services. This is bobbing about in the #1 - #3 slots. My own site is 100% services, 65% widgets, and is also in the #1 - #3 slots for this term.

I'm starting to wonder if the Google algorithm isn't sure, but is trying to learn which of the two words is most important in the two-word term. I suspect that over time it will realise that widgets is the general part and services is the specific part. It will therefore work out that when people search for this two-word term they are more interested in the specific, but only when it applies to the general. Over time, sites that are wholly focused on the two-word term, or are balanced more towards the specific, will rank higher. Personalised search will tell Google that the specific is a more satisfactory answer than the general.
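The arithmetic behind that speculation can be sketched like this (all numbers and the blending formula are hypothetical, just to make the two-site comparison above concrete): blend each site's per-word strength with a learned weight on the specific word, and watch the ranking flip as that weight grows.

```python
# Toy blend of per-word sitewide strengths for a two-word query.
# specific_weight is the (hypothetical) learned weight on the specific word.
def blended_score(widgets, services, specific_weight):
    return (1 - specific_weight) * widgets + specific_weight * services

site_a = (1.00, 0.20)  # strong on "widgets" (general), weak on "services"
site_b = (0.65, 1.00)  # the poster's site: strong on "services" (specific)

# As the weight shifts toward the specific word, site_b overtakes site_a.
for w in (0.3, 0.5, 0.7):
    print(w, blended_score(*site_a, w), blended_score(*site_b, w))
```

At a low weight on the specific word the widgets-heavy site edges ahead; once the weight passes roughly a third, the services-focused site wins, which matches the "over time the specific will rank higher" prediction.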
Cheers
Sid
Just to add: I also notice that when I click on More>> it says "These should reflect the subject matter of your site." Perhaps this indicates that a semantic topicality metric is used in the new algorithm.
Perhaps this indicates a semantic topicality metric is used in the new algorithm.
I'd agree with that - no matter what that "should reflect" message means. Semantic information has been growing in the algo for a while. The Phrase-Based Indexing patents [webmasterworld.com] laid out a lot of Google's approach.
In addition, there are query-term taxonomies that are also infused with user-intention data. There's quite a macramé of semantics going on, and I'm hoping that the Caffeine infrastructure allows all of it to be recalculated more frequently.