- maybe it was me - sometimes I'll actually search on what seems like a real human question vs just keywords.
- also consider someone might cut and paste something weird from another document to search on.
- don't forget that search suggestions might come into play and I've seen some pretty long natural language suggestions - just today, as a matter of fact.
Well, it could have been someone, but it was a pretty abstract phrase and it was asked over 150 times but only over a three-day period.
Maybe it isn't that important really, but I would like to know IN GENERAL how accurate keyword tracking is in GA.
Planet13 - I'm asking myself the same thing.
I have a phrase "widgets & cogs" ("widgets%20%26%20cogs") show up in Google Analytics all from the same three cities (Huntington Beach, Long Beach and New York City). The time on site for Huntington Beach is 0.0 seconds and the bounce rate is 100%. The others have time on site of less than 13 seconds. The "source" is yahoo.com
So, for example, on June 26, I have 8 hits for "widgets & cogs" coming from yahoo.com geolocated to Huntington Beach.
When I check in my server logs, however, I find
- 1 referrer with the phrase "widgets & cogs" and it is from google.com
- not a single entry with yahoo.com in the referrer
So the GA data and the raw log data are consistently at odds.
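For anyone wanting to repeat this kind of check, here is a rough sketch of pulling search phrases out of the referrer field of a raw log and counting them per search engine. It assumes Google carries the phrase in the `q` parameter and Yahoo in `p`; the function names and the sample referrers are made up for illustration, and getting the referrer strings out of your own log format is left to you.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Query-string parameter that carries the search phrase, per engine.
# Assumption: Google uses "q" and Yahoo uses "p"; extend as needed.
SEARCH_PARAMS = {"google.": "q", "yahoo.": "p"}


def search_phrase(referrer):
    """Return (host, phrase) for a search-engine referrer, else None."""
    parsed = urlparse(referrer)
    host = parsed.netloc.lower()
    for marker, param in SEARCH_PARAMS.items():
        if marker in host:
            # parse_qs also URL-decodes, so %20%26%20 becomes " & ".
            values = parse_qs(parsed.query).get(param)
            if values:
                return host, values[0]
    return None


def count_phrases(referrers):
    """Count (host, phrase) pairs over an iterable of referrer strings."""
    counts = Counter()
    for referrer in referrers:
        hit = search_phrase(referrer)
        if hit is not None:
            counts[hit] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical referrers, as they might appear in an access log.
    sample = [
        "http://www.google.com/search?q=widgets%20%26%20cogs&hl=en",
        "http://search.yahoo.com/search?p=blue+widgets",
        "-",
    ]
    for (host, phrase), n in count_phrases(sample).items():
        print(host, repr(phrase), n)
```

Comparing these counts per day against the GA keyword report shows which engine and phrase the logs actually saw.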
So actually this phrase is "kw1 & kw2 kw3"
GA says that "kw1 & kw2" got 300 hits in the last month from Yahoo geolocated to Huntington Beach and Long Beach.
The raw logs show not a single search for "kw1 & kw2 kw3". There are rather three searches for "kw1 & kw2" and one search for "kw1 & kw2 kw3 kw4"
So to answer your question - very inaccurate it would appear. The question is why?
I have found that long, unlikely yet popular phrases are caused by Google autocomplete.
Not sure how they get in the autocomplete list in the first place given they seem unlikely, but I've had it happen more than once; "why are so many people searching on that?" start typing it in google - ohhh.... :)
If the site is very busy, GA may sample the data: only a percentage of the data is processed for some reports, and GA then uses that to estimate the numbers. This can be one reason behind high numbers of weird searches.
GA used to make it very clear that sampling was happening but recently I've seen it sampling more and more without reporting it.
A way to reduce sampling on busy sites is to pull reports for shorter timeframes and then stitch them together yourself (we do this with PHP/MySQL, but you could use Excel if you don't do it often). We have seen considerable differences in the search term figures when taking 30 separate one-day reports versus a single 30-day report.
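The stitching itself is simple. Below is a minimal sketch in Python rather than PHP/MySQL, assuming each daily report has already been loaded as a dict of keyword to visits; the function names and the numbers in the example are hypothetical, not real GA output.

```python
from collections import Counter


def stitch_daily_reports(daily_reports):
    """Sum per-keyword visits across daily reports (dicts of keyword -> visits)."""
    total = Counter()
    for report in daily_reports:
        total.update(report)  # Counter.update adds counts, it does not overwrite
    return total


def discrepancies(stitched, long_report):
    """Return {keyword: (stitched_visits, long_report_visits)} where the two differ."""
    keywords = set(stitched) | set(long_report)
    return {
        kw: (stitched.get(kw, 0), long_report.get(kw, 0))
        for kw in keywords
        if stitched.get(kw, 0) != long_report.get(kw, 0)
    }


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    days = [{"kw1 & kw2": 2, "blue widgets": 40}, {"kw1 & kw2": 1, "blue widgets": 35}]
    month = {"kw1 & kw2": 300, "blue widgets": 75}
    print(discrepancies(stitch_daily_reports(days), month))
```

Keywords that show up in the discrepancy list with a much larger figure in the long report are the ones to suspect of being inflated by sampling.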
Thanks for the tip, inbound!
It's one of the things I hate about GA and their tools. Only way to know is to have another stats package installed and compare. StatCounter is good about giving you accurate information for individual visits instead of just showing trends.
One thing to consider is that (in general terms) Google Analytics will use the last known source/medium combination if a new one is not provided.
For instance: I arrive to your site through a search with "kw1". That keyword will get credited with one visit.
Then I go to the site through "kw2", and that keyword gets credited with a visit. But this time I bookmark the page or I memorize the URL... from that point on, every time I arrive directly to your site, "kw2" will get credited until I come through a different source / medium.
The theory behind this is that GA would like to provide some information rather than nothing. You can configure your code to avoid this, but surely you can see the value.
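To make the effect concrete, here is a toy model of that rule for a single visitor. It is a simplified sketch, not GA's actual code: the function name is made up, a direct visit is represented as `None`, and details such as cookie expiry are ignored.

```python
from collections import Counter


def attribute_visits(visits):
    """Credit each visit to a keyword under a 'last known source wins' rule.

    visits: one visitor's visits in order; each item is the search keyword
    used to arrive, or None for a direct visit (bookmark, typed URL).
    """
    credited = []
    last_known = None
    for keyword in visits:
        if keyword is not None:
            last_known = keyword  # a new source replaces the remembered one
        # A direct visit with no earlier source stays "(direct)".
        credited.append(last_known if last_known is not None else "(direct)")
    return credited


if __name__ == "__main__":
    # kw1 once, kw2 once, then three direct visits via a bookmark.
    visits = ["kw1", "kw2", None, None, None]
    print(Counter(attribute_visits(visits)))
```

In this model "kw2" ends up credited with four visits even though it was searched for only once, which is one way a report can show far more hits for a phrase than the raw referrers do.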
Another possible explanation is that the long and unnatural keywords come from sites that participate in the AdWords search partners network.
For some of these sites, Google doesn't report the queries made by users but a string of keywords taken from the navigation or file path of the page on which the AdSense ads were shown.
For instance a keyword like "category brandA brandA widgetA sale" could come from a url like /category-brandA/brandA-widgetA/sale.
I haven't seen this for natural language looking types of queries but it's not unthinkable.
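As an illustration of the file path example above, this is roughly how such a keyword string could be derived from a URL path. It is only a guess at the general idea, assuming the path is split on slashes, hyphens and underscores; the function name is hypothetical and nothing here reflects how Google actually builds these strings.

```python
import re


def path_to_keywords(path):
    """Turn a URL path into a space-separated keyword string."""
    # Split on slashes, hyphens and underscores, dropping empty pieces.
    tokens = [t for t in re.split(r"[/\-_]+", path) if t]
    return " ".join(tokens)


if __name__ == "__main__":
    print(path_to_keywords("/category-brandA/brandA-widgetA/sale"))
```

If a suspicious keyword in your reports can be reproduced this way from one of your own URLs, that points to this explanation rather than to real user queries.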