
Forum Library, Charter, Moderators: Receptional & mademetop

Website Analytics - Tracking and Logging Forum

How Accurate Are Keywords Listed In Google Analytics

 11:27 pm on Jul 17, 2013 (gmt 0)

How accurate - if at all - are the keywords listed in Google Analytics when you have the secondary dimension set to "keyword"?

Some of them seem plausible, but some seem way off...

For instance, I have an advanced segment to track visitors who use long tail keywords (basically when the keyword is four or more words long).

One keyword that showed up near the top of the list was highly implausible:

"what's it called when people do [XYZ] with their [blue widgets]"

Obviously, XYZ and blue widgets are just examples I used.

Any ideas how this pretty obscure phrase could be listed so highly?

Thanks in advance.
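
The "four or more words" rule behind the long tail segment above can be checked outside GA as well, for example against an exported keyword list. The following is only a rough sketch: the function name and the sample keywords are made up for illustration, and it assumes words are separated by whitespace.

```python
def is_long_tail(keyword: str, min_words: int = 4) -> bool:
    """Return True when the keyword has at least `min_words` whitespace-separated words."""
    return len(keyword.split()) >= min_words


# Hypothetical sample keywords, not taken from any real report.
keywords = [
    "blue widgets",
    "what's it called when people do xyz",
    "buy blue widgets online cheap",
]

long_tail = [k for k in keywords if is_long_tail(k)]
print(long_tail)
```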



 4:45 am on Jul 18, 2013 (gmt 0)

- maybe it was me - sometimes i'll actually search on what seems like a real human question vs just keywords.

- also consider someone might cut and paste something weird from another document to search on.

- don't forget that search suggestions might come into play and i've seen some pretty long natural language suggestions - just today, as a matter of fact.


 5:59 am on Jul 18, 2013 (gmt 0)

Well, it could have been someone, but it was a pretty abstract phrase and it was asked over 150 times, all within a three-day period.

Maybe it isn't that important really, but I would like to know IN GENERAL how accurate keyword tracking is in GA.


 8:20 pm on Jul 19, 2013 (gmt 0)

Planet13 - I'm asking myself the same thing.

I have a phrase "widgets & cogs" ("widgets%20%26%20cogs") show up in Google Analytics all from the same three cities (Huntington Beach, Long Beach and New York City). The time on site for Huntington Beach is 0.0 seconds and the bounce rate is 100%. The others have time on site of less than 13 seconds. The "source" is yahoo.com

So, for example, on June 26, I have 8 hits for "widgets & cogs" coming from yahoo.com geolocated to Huntington Beach.

When I check in my server logs, however, I find
- 1 referrer with the phrase "widgets & cogs" and it is from google.com
- not a single entry with yahoo.com in the referrer

So the GA data and the raw log data are consistently at odds.
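
For anyone who wants to run the same kind of log check, here is a minimal sketch of counting search phrases per referring host from raw access log lines. It assumes the combined log format (referrer and user agent as the last two quoted fields) and assumes the search phrase travels in the "q" parameter for Google and Bing and in "p" for Yahoo. The sample log lines, IP addresses and function names are invented for illustration.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qs

# Assumed combined log format: ... "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$')

# Assumption: query parameter that carries the search phrase for each engine.
QUERY_PARAMS = {"google.": "q", "yahoo.": "p", "bing.": "q"}


def search_phrase(referrer):
    """Return (host, phrase) if the referrer looks like a search results URL, else None."""
    parts = urlsplit(referrer)
    host = parts.netloc.lower()
    for marker, param in QUERY_PARAMS.items():
        if marker in host:
            values = parse_qs(parts.query).get(param)
            if values:
                # parse_qs already decodes %20 and %26, so "widgets%20%26%20cogs"
                # comes back as "widgets & cogs".
                return host, values[0].strip().lower()
    return None


def count_phrases(lines):
    """Count (host, phrase) pairs found in the referrer field of each log line."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line.strip())
        if not match:
            continue
        hit = search_phrase(match.group("referrer"))
        if hit:
            counts[hit] += 1
    return counts


# Invented sample lines for illustration only.
sample_lines = [
    '203.0.113.5 - - [26/Jun/2013:10:00:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"http://www.google.com/search?q=widgets%20%26%20cogs" "Mozilla/5.0"',
    '203.0.113.9 - - [26/Jun/2013:10:05:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

counts = count_phrases(sample_lines)
for (host, phrase), n in counts.items():
    print(host, repr(phrase), n)
```

Comparing these per-day counts with the GA keyword report for the same day shows where the two disagree. Note that visits without a referrer, or with the query stripped from the referrer, will never show up in a count like this.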


 8:31 pm on Jul 19, 2013 (gmt 0)

So actually this phrase is "kw1 & kw2 kw3"

GA says that "kw1 & kw2" got 300 hits in the last month from Yahoo geolocated to Huntington Beach and Long Beach.

The raw logs show not a single search for "kw1 & kw2 kw3". There are rather three searches for "kw1 & kw2" and one search for "kw1 & kw2 kw3 kw4"

So to answer your question - very inaccurate it would appear. The question is why?


 8:23 am on Aug 18, 2013 (gmt 0)

I have found that long, unlikely yet popular phrases are caused by Google autocomplete.
I'm not sure how they get into the autocomplete list in the first place, given that they seem unlikely, but I've had it happen more than once: "why are so many people searching on that?", then I start typing it into Google - ohhh.... :)


 2:30 pm on Aug 22, 2013 (gmt 0)

If the site is very busy then GA can sample the data, so only a percentage of the data is processed for some reports, and GA then uses that sample to estimate the numbers. This can be one reason behind high numbers of weird searches.

GA used to make it very clear that sampling was happening but recently I've seen it sampling more and more without reporting it.

A way to reduce sampling on busy sites is to get reports that cover shorter timeframes and then stitch them together yourself (we do this with PHP/MySQL but you could use Excel if you don't do it often). We have seen considerable differences in the search term figures when taking 30 separate one-day reports versus a single 30-day report.
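
The stitching step itself is simple addition. Below is a minimal sketch in Python rather than PHP/MySQL, assuming each daily report has already been exported and loaded as a mapping of keyword to visit count. The function name and the numbers are hypothetical.

```python
from collections import Counter


def stitch_daily_reports(daily_reports):
    """Sum per-keyword visit counts across a sequence of single-day reports."""
    total = Counter()
    for report in daily_reports:
        total.update(report)  # Counter.update adds counts instead of replacing them
    return total


# Hypothetical daily reports: keyword -> visits for that day.
daily = [
    {"kw1": 3, "kw2": 1},
    {"kw1": 2},
    {"kw3": 5},
]

monthly = stitch_daily_reports(daily)
print(monthly.most_common())
```

This only works for metrics that add up across days, such as visits. Something like unique visitors cannot be summed this way, because the same person can appear on several days.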


 9:32 pm on Aug 22, 2013 (gmt 0)

Thanks for the tip, inbound!


 1:49 pm on Aug 23, 2013 (gmt 0)

It's one of the things I hate about GA and their tools. The only way to know is to have another stats package installed and compare. StatCounter is good about giving you accurate information for individual visits instead of just showing trends.


 11:10 pm on Aug 23, 2013 (gmt 0)

One thing to consider is that (in general terms) Google Analytics will use the last known source/medium combination if a new one is not provided.

For instance: I arrive at your site through a search with "kw1". That keyword will get credited with one visit.

Then I go to the site through "kw2", and that keyword gets credited with a visit. But this time I bookmark the page or I memorize the URL... from that point on, every time I arrive directly at your site, "kw2" will get credited until I come through a different source / medium.

The theory behind this is that GA would like to provide some information rather than nothing. You can configure your code to avoid this, but I'm sure you can see the value.
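
To make the effect concrete, here is a toy model of the crediting rule described above. It is a simplified sketch and not GA's actual implementation: it looks at one visitor, ignores any expiry of the stored source, and uses None to stand for a direct visit that brings no new keyword. All names are hypothetical.

```python
from collections import Counter


def credit_visits(visits):
    """Credit each visit of one visitor to the last known keyword.

    `visits` is an ordered list where each item is a keyword, or None for a
    direct visit (bookmark, typed URL) that carries no new source.
    """
    credited = Counter()
    last_keyword = None
    for keyword in visits:
        if keyword is not None:
            last_keyword = keyword  # a new source overwrites the stored one
        label = last_keyword if last_keyword is not None else "(direct)"
        credited[label] += 1
    return credited


# One search on kw1, one search on kw2, then three bookmark visits.
visits = ["kw1", "kw2", None, None, None]
print(credit_visits(visits))
```

In this model "kw2" ends up with four visits even though it was searched only once, which is one way an obscure phrase can climb the keyword list.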


 10:59 am on Aug 26, 2013 (gmt 0)

Another possible explanation is that the long and unnatural keywords come from sites that participate in the AdWords search partners network.
For some of these sites, Google doesn't report the queries made by users but a string of keywords taken from the navigation or filepath of the page on which the AdSense ads were shown.

For instance a keyword like "category brandA brandA widgetA sale" could come from a url like /category-brandA/brandA-widgetA/sale.

I haven't seen this for natural language looking types of queries but it's not unthinkable.
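
To check whether a suspicious keyword could be a flattened filepath like the one above, something along these lines can rebuild the keyword string from a URL path and compare it with the report. How Google actually derives these strings is not known here; the sketch simply assumes slashes, hyphens and underscores become spaces, and the function name is made up.

```python
import re


def path_to_keyword(path: str) -> str:
    """Turn a URL path into a space-separated keyword string.

    Assumption: "/", "-" and "_" all act as word separators.
    """
    tokens = re.split(r"[/\-_]+", path.strip("/"))
    return " ".join(token for token in tokens if token)


print(path_to_keyword("/category-brandA/brandA-widgetA/sale"))
```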

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved