"Significance" is slightly misleading since all it shows is the number of occurrences of a word in comparison to the most used word. So, if the top keyword has a hundred occurrences, the "significance" of a word with 50 is 50%. It doesn't say anything that useful at all, although the wording implies that it does.
What I've been doing is flipping back & forth between "Keyword Significance" and "Top Search Queries". In some cases the very top keywords listed are part of the top queries, but in other cases they're not -- am trying to get a better handle on those relationships.
Here's something that's been confusing me for a while. Under Top Search Queries they have 2 columns, "Impressions" and "Clickthrough". Often the top impression query will be (for example): blue widget position 9 ; but then under clickthrough it will say: blue widget position 4. Same exact query but listed as 2 different positions, which seems contradictory.
I've always thought top search queries in general was very buggy. I have a couple sites where it shows a couple ranks I supposedly have that I'm nowhere close to, and it's been showing them for months.
|Same exact query but listed as 2 different positions, which seems contradictory |
I see that too, but WMT is an average and I always see higher rankings than it shows (yes I'm signed out and I use different machines too).
|I especially take note of this line: |
"These should reflect the subject matter of your site."
Hm. I've seen these for an ecommerce site I have which has nicely varied and natural language to describe products. However the second most common word on the site according to WMT is 'none' because we have a form for each product to select numbers and the default is 'none'. Interestingly enough, 1,2,3,4 etc don't show.
Could it seriously be that I'd be optimising the site to replace 'none' with '0' in the drop-downs? Pathetic.
The keyword thing is strange. My site is centered around a two word phrase. The first word is near the top of the list and the second word is nowhere in the list.
I think they may have a fairly broad "stopword" list, as suggested in this thread [webmasterworld.com].
|"Significance" is slightly misleading since all it shows is the number of occurrences of a word in comparison to the most used word. So, if the top keyword has a hundred occurrences, the "significance" of a word with 50 is 50%. It doesn't say anything that useful at all, although the wording implies that it does. |
Have you tried to click on the keyword shown in "Keywords" section of GWT? Whereas it used to show you the "anchor text" used, now it actually shows the total number of occurrences found and the top pages with this keyword. Not sure how much the data can be trusted though!
|I've always thought top search queries in general was very buggy. I have a couple sites where it shows a couple ranks I supposedly have that I'm nowhere close to, and it's been showing them for months. |
What I have noticed is that the ranks shown on the initial "Top search queries" screen could be the ones you had on a country-specific Google or for a different type of search (e.g. images).
For example, if you are shown as #3 for "gold widget" and you click on that query within GWT, it will run a search for "gold widget" on Google.com, whereas your #3 result might have been for the same phrase on Google.it. Or it might have been for image search.
You can check this by playing with the "all searches" and "all Google domains" drop-down boxes on the Top Search Queries screen to see which keywords are reported for a specific Google domain/search type. In my experience this reflects the actual ranking of the keyword more closely.
Thanks aakk9999 for suggesting that we fine tune the GWT output. For our particular websites, we only care about the USA market and mostly care about "Web Search" (rather than Images or Mobile), so I'm getting more specific information now. I also like the way I can click on a keyword in the column on the main dashboard to see the total number of occurrences, and which "top" pages it appears on the most -- all of which is useful.
Well - turns out on one of my most important sites, the words 'colour', '0000ff', 'val', and 'tip' are most important. Sigh.
I wonder if it is because Google keeps trying to index the JS files generated by my dynamic graphs.
My WMT Keyword tool currently returns: "No data available. (Why not?)"
Just to clarify this - it doesn't show you which words are most significant or most important - just which ones occur the most. This in itself might imply importance, but that isn't the same thing.
What strikes me is the apparent importance of words in your footer for sitewide relevance. I see part of my last name appearing high on the list!
>>What strikes me is the apparent importance of words in your footer for sitewide relevance. I see part of my last name appearing high on the list!
I noticed this too, I think it is because the term is repeated on many pages.
|My WMT Keyword tool currently returns: "No data available. (Why not?)" |
Is the site relatively new? Or have you only recently added it to your GWT account? I have that same message for one of my sites also, but it only went up in mid-Sept and was only added to GWT last week, so I figure it's taking time for Google to examine all the data and run it through the algo, prior to making that data available to me.
|I noticed this too, I think it is because the term is repeated on many pages. |
Does this imply that, by using some craftily constructed sentence as a footer, one might be able to gain additional sitewide relevance for important keywords?
Sounds like an interesting idea to me.
|Does this imply that, by using some craftily constructed sentence as a footer, one might be able to gain additional sitewide relevance for important keywords? |
As far as I can tell, there's no indication of importance at all, other than the assumption that more mentions = more importance which I don't think stacks up to any degree.
The equation used to create the "significance" graph is:
number of occurrences / number of occurrences of most frequent word * 100
Just a percentage calculation.
It's purely based on word count, which means that in many examples I've seen, a large file like a sitemap or RSS feed contributes hugely to which words occur most frequently - with no significant impact on rankings.
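For anyone who wants to see the arithmetic, here is a minimal Python sketch of the percentage calculation described above. The function name and the sample counts are made up for illustration; this is not Google's code, just the formula as stated in this post.

```python
from collections import Counter

def significance(word_counts):
    """Express each word's count as a percentage of the most frequent word's count."""
    if not word_counts:
        return {}
    top = max(word_counts.values())  # occurrences of the most frequent word
    return {word: count / top * 100 for word, count in word_counts.items()}

# Hypothetical counts: the top keyword has a hundred occurrences
counts = Counter({"widget": 100, "blue": 50, "none": 25})
scores = significance(counts)
```

With these counts the word with 50 occurrences comes out at 50%, matching the example earlier in the thread. Nothing in the calculation looks at where or how a word is used, only how often.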
|that in many examples I've seen... with no significant impact on rankings |
I noticed this last night. At one time with Yahoo accounts, they recommended adding a text file to your hosting directory, with the full page addresses stacked one on top of the other. So a 100 page site would have 100 lines. If your domain included an important keyword, and if your page address often included that same keyword (let's say half the time), then you'd have 150 occurrences of that keyword in that single .txt file. This otherwise meaningless file showed up in my GWT account as one of the "top" pages with my primary keyword, which to me confirms what you've said.
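The 150 figure is easy to reproduce. Below is a rough Python sketch that builds a hypothetical 100-line URL list (the domain, paths, and keyword are invented for the example) and counts a keyword the crude way, as a plain substring match. It assumes nothing about how GWT really counts.

```python
import re

def count_keyword(text, keyword):
    """Count case-insensitive occurrences of keyword anywhere in text."""
    return len(re.findall(re.escape(keyword), text, flags=re.IGNORECASE))

# Hypothetical URL list: the keyword is in the host name on every line,
# and also in the page path on every second line.
lines = []
for i in range(100):
    path = f"blue-widget-{i}.html" if i % 2 == 0 else f"page-{i}.html"
    lines.append(f"http://widgets.example.com/{path}")
url_list = "\n".join(lines)

total = count_keyword(url_list, "widget")  # 100 from the host + 50 from the paths
```

A file like this has no real content, yet on a pure occurrence count it would outscore most ordinary pages for the keyword.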
My #1, #2, #3, and #9 most "significant" keywords are:
These are actually insignificant words for my site. Should I work to cut back on their repetition?
I also noticed that the top 3 queries in "Top search queries" list positions 6, 5, and 5, but that is WAY off. These positions are listed with "Web Search" and "(United States) google.com" selected, but their actual positions are about 70, 70, and 90. Could this indicate some type of a penalty?
|These are actually insignificant words for my site. Should I work to cut back on their repetition? |
I doubt this would be a productive activity.
I'm cursing Google for using the wording "significance" for this report since I believe it is entirely misleading. It's really measuring something very close to keyword density, but site-wide. A crude and insignificant measurement IMO.
To give another example. For one site, one of the most occurring keywords is "href" based on its occurrence in a (perfectly valid) RSS feed that Google has misinterpreted. The file itself can't even be found in Google when searching for "href" (it won't show at all, as it's RSS) as is the case for many of the other files listed.
I find the report useful for what it implies about Google's processing, rather than because it tells me anything actionable about keyword usage in general.
Regarding ranking discrepancies, I notice they count each image from image search as one result -- but you probably thought of that.
Do your actual ranks correlate with this measure of 'significance'? I suspect not. In which case this tool should be ignored as misleading. Why it is misleading is open to speculation ;)
|I'm cursing Google for using the wording "significance" for this report since I believe it is entirely misleading |
If it is completely irrelevant then it's funny that they would bother making the data available to webmasters. I appreciate that the figures only relate to themselves rather than some benchmark, but I'm still going to do a bit of testing. If you can increase relevancy across the site as a whole then that can't be a bad thing.
Of course, perhaps it is completely irrelevant and a purposeful distraction to get people obsessed with KWD again.
If you frequently use irrelevant keywords in copy, then, of course, fixing that is a good thing. I just don't like the use of the word "significance" when it should really say "percentage". The graphs would be no different if they just used the count of keywords provided.
I have one interesting observation to share. The most common "keyword" reported in GWT on one of the sites we are following is a completely made up word which is used as alt text for a graphic image on the site and which appears 6 times on every page in the main navigation. The graphic itself is a kind of "rounded tab corner" and we think that the alt text was originally put there for the page to validate.
Having seen the number of occurrences of this "keyword" in the new GWT Keywords section, we have asked for this alt text to be removed. This removal was done about a week ago.
Now, I would have expected for the number of occurrences of this "keyword" to stay the same for a while and then slowly start to drop as Google re-crawls the pages and does not find this term any more.
What has been happening, however, is that over the last 5 days the count for this "keyword" has increased by about 1500. Whilst I know that long term this "keyword" will disappear, the short term increase can only be explained by two things:
1) very buggy GWT Keywords report; and/or
2) Google is re-processing the data it has already got in its index and is dragging info out of it and storing it on a more granular level
Point 2) would fit with the infrastructure / data storage changes that Caffeine was about.
Looking at the crawl stats of the site, I cannot see any increase in crawling at all; in fact, crawling seems to have been below average over the last couple of weeks.
|For one site, one of the most occurring keywords is "href" based on its occurrence in a (perfectly valid) RSS feed |
I can top that...
I checked two sites. The one seemed pretty good, but strangely, I find words from code appearing all over the report for another site. So supposedly some of my top keywords are
Really? True, this is running on a super bloated Drupal theme that I used for prototyping and have intended to recode and streamline, but I didn't expect it would be *that* bad (but both sites actually have the same underlying theme and the one is much better than the other).
I wish I had screenshots from before they started the "significance" thing, but I don't recall seeing these keywords ever before.
It also gets hung up on URLs and doesn't decode encoded URLs properly so if I have a path like
as keywords for my site, failing to recognize that %2f is a url-encoded slash (which, BTW, I think Google is encoding, because I don't think the URLs in question are encoded on the site - certainly none of them are returned by a site: search).
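To show what decoding would look like, here is a small Python sketch using the standard library's `urllib.parse.unquote`. The path fragment is hypothetical (it is not from my site), and the "raw" split is only one guess at how a tokenizer that skips decoding might mangle it; nothing here shows how GWT actually tokenizes.

```python
from urllib.parse import unquote

encoded = "widgets%2fblue"  # hypothetical encoded path fragment

# Decoding first turns "%2f" back into "/", so the words separate cleanly
decoded = unquote(encoded)
words_decoded = decoded.split("/")

# Illustrative only: splitting the raw string on "%" leaves the hex digits
# glued to the following word
words_raw = encoded.split("%")
```

Decoded, the fragment gives `widgets` and `blue`; left encoded, the second piece becomes `2fblue`, which is the sort of junk "keyword" I mean.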
As a general thing, I find that if you have any recurring utility text across many pages, it sees those terms as significant. So, for example, a blog has "comment" and "post" as significant words because every post has "Comment on this post" at the bottom of it.
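Here is a rough Python sketch of why that happens if the report is a plain site-wide word count. The three pages are invented, and the tokenizer (lowercase, letters only) is an assumption for illustration, not a description of what Google does.

```python
import re
from collections import Counter

def site_word_counts(pages):
    """Add up word counts across every page of a site."""
    totals = Counter()
    for text in pages:
        # Assumed tokenizer: lowercase the text and keep runs of letters
        totals.update(re.findall(r"[a-z]+", text.lower()))
    return totals

# Hypothetical blog: different content on each page, same utility line at the bottom
pages = [
    "Blue widgets reviewed. Comment on this post",
    "Red gadgets compared. Comment on this post",
    "Green gizmos tested. Comment on this post",
]
totals = site_word_counts(pages)
```

Each content word appears once, while "comment" and "post" appear once per page, so the utility text rises to the top as soon as the site has more than a handful of pages.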
|a blog has "comment" and "post" as significant words because every post has "Comment on this post" at the bottom of it. |
For my sites I make those sort of repetitive things small graphics, so they don't ever show up in the keywords.
And I must say, this morning as I re-checked many of my sites, GWT is overwhelmingly on target and (with only a few exceptions) is returning exactly the words that I would want them to see as "significant".
Now having said that, I'm NOT saying that I think that accuracy is necessarily helping me one way or the other in the SERPs.
But it sure as heck isn't hurting either, and given the alternative, I'm pleased that they've nailed it as well as they have...
|small graphics, so they don't ever show up in the keywords |
Yeah, but look at my "keyword" list:
Those are being triggered by images of various sorts. So adding img tags would just make things worse I think.
I'll play the devil's advocate with Receptional Andy for a moment on a couple of points.
From Google's POV, when assessing what a website is all about, they must use the keyword copy on the pages throughout the site as one of the biggest signals.
So, it SHOULD be a big red-flag to a webmaster if the main keyword phrases they wish to be found ranking for are not showing up in Google's assessment (indexing) of the site. How frequently that word appears in relation to other words on the same site also likely indicates how strongly the pages could rank when any combination of terms appear on the same page.
So, in that context, yes, this does show some significance.
The implication is clear that if your most desirable keywords and keyword phrases do not appear within the content of your site, your site and pages are far less likely to rank for that term.
Taking off my "devil's advocate" hat, I'd join Receptional Andy's side in suggesting that Google really should improve the interfaces and explanatory Help pages to make it clearer how such information could be useful to webmasters. Going off the Help page for this, once you've assessed that you don't have unrelated keywords like Viagra, and you do have keywords you want, what do you do about how oddly they've assessed the relative importance or "significance"? In my case some keywords that are not on every single page show as more significant than the site brand name, which is on every page. And, I've long complained that they should filter out some terms that one could expect to appear on a great many sites but be unrelated to the actual themes of the sites, such as "copyright" and "home".
I like devil's advocacy ;)
|So, in that context, yes, this does show some significance |
My major gripe is that Google have merely labelled "number of occurrences" as "significance" - there is no change in the data and it is not processed in any way. There is a case that more occurrences is identical to more significance, but I don't believe it is a strong one.
The other thing which is cropping up in this thread is that the data does not seem to be the same as is used by Google for ranking - hence the removal of many "stopwords" which Google does index and which influence ranking, and the occurrence of parts of code as words which appears to be a simple parsing bug - again, not present in Google results.