I noticed that my Netracker Professional 7.5 log analyzer was not picking up all the Google keywords on initial referrals. I checked and my free version of Webfunnel that I use for quick traffic analysis during the day wasn't picking up all the Google keywords either.
In addition to the above log analyzers, I also use some custom macros with UltraEdit, a powerful text editor, for special tasks. I looked into the log files and found incoming Google traffic with lines like this:
proxya.scott.af.mil - - [25/Mar/2005:11:27:29 -0500] "GET /online-store/scstore/p-02100.html HTTP/1.1" 200 12728 "http://www.google.com/search?sourceid=navclient&ie=UTF-8&rls=GGLD,GGLD:2004-19,GGLD:en&q=HP+2200+toner+" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
The area of the log entry that appears to be screwing up my log analyzers is the run of toolbar parameters in the referrer ahead of "q=", including "rls=GGLD,GGLD:2004-19,GGLD:en". They were not picking up "HP+2200+toner" as a keyword. From what I could find on the web, these log entries appear to originate from people doing searches with the Google toolbar. There are two formats of this type: the one in the log entry above, and this one:
dsl-KK-static-231.202.95.61.touchtelindia.net - - [25/Mar/2005:08:02:02 -0500] "GET /online-store/scstore/c-Omnifax.html HTTP/1.0" 200 11654 "http://www.google.com/search?hl=en&lr=&rls=GGLD%2CGGLD%3A2004-35%2CGGLD%3Aen&q=XEROX+OMNIFAX" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
The key area of this log entry is the same "rls" parameter, this time URL-encoded ("%2C" for the comma and "%3A" for the colon). In posts from other people looking at these log entries, the best guess was that the letters "rls" stand for "release" and that the following "2004-35" is the version and date of the specific Google toolbar being used. Sounds logical to me.
In any event, neither Netracker nor Webfunnel could pick up the keywords in these log entries. Losing all the keywords from Google toolbar users coming into my site was screwing up my AdWords ROI analysis and messing with the tracking of my organic SERP SEO efforts as well.
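For anyone who wants to check their own logs, here is a minimal Python sketch of pulling the search phrase out of the referrer by parsing the query string rather than relying on the position of "q=". It assumes the combined log format shown above, where the referrer is the second-to-last quoted field. The function name google_keywords is made up for illustration and has nothing to do with how Netracker or Webfunnel work internally.

```python
import re
from urllib.parse import urlsplit, parse_qs

# In the combined log format the line ends with: "referrer" "user-agent"
REFERRER_RE = re.compile(r'"([^"]*)" "[^"]*"\s*$')


def google_keywords(log_line):
    """Return the Google search phrase from a log line's referrer, or None.

    Hypothetical helper: assumes a combined-format line like the examples
    in this post.
    """
    match = REFERRER_RE.search(log_line)
    if not match:
        return None
    referrer = urlsplit(match.group(1))
    if "google." not in referrer.netloc:
        return None
    # parse_qs turns '+' into spaces and decodes %XX escapes, so the plain
    # and URL-encoded rls= variants are handled the same way.
    phrases = parse_qs(referrer.query).get("q")
    return phrases[0].strip() if phrases else None
```

Because the query string is split on "&" and looked up by name, it does not matter how many toolbar parameters come before "q=" or whether they are URL-encoded.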
I studied the structure of the entries and wrote a macro using UltraEdit that stripped out the toolbar part of the entry and rewrote it to look like a standard Google log entry. Because there are two versions of these log entries, I had to write a separate macro for each format.
Running the macros against three weeks' worth of log data was incredibly slow: about 6 hours to run it twice on about 275,000 log entries. On the other hand, it was the only available tool to do this with - plain search and replace is useless with so many different Google toolbar release dates, etc.
I tested it with both Netracker and Webfunnel on today's traffic, and the difference in keyword analysis was incredible. Before cleaning up the log entries they both picked up about 85 keywords. After the macros rewrote the Google toolbar log entries into traditional Google format, they picked up 124 keywords!
I am writing this post to alert anyone trying to do keyword analysis on their PPC ads, or anyone doing SEO, about this issue. If someone knows how to solve this problem when using log analyzers like Netracker, please let me know. If anyone has any other insight on this, please let me know also.
To show what the macros do, they turn a log entry like this:
p86-135.acedsl.com - - [26/Mar/2005:10:43:59 -0500] "GET /online-store/scstore/p-NEC870MR.html HTTP/1.1" 200 11860 "http://www.google.com/search?hl=en&rls=GGLD,GGLD:2004-07,GGLD:en&q=nec+superscript+870+toner&spell=1" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
into a log entry that looks like this:
p86-135.acedsl.com - - [26/Mar/2005:10:43:59 -0500] "GET /online-store/scstore/p-NEC870MR.html HTTP/1.1" 200 11860 "http://www.google.com/search?q=nec+superscript+870+toner&spell=1" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
Please note how it has rewritten the log entry to look similar to the usual non-Google-toolbar log entry. This is a difficult find-and-replace problem, since the text being rewritten varies widely from one log entry to the next.
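For comparison, the same rewrite can be sketched with a regular expression in Python instead of editor macros. This is only a sketch under some assumptions. It treats any Google search referrer that has an "rls=" parameter somewhere before "q=" as a toolbar entry. Like the example above, it drops everything between "search?" and "q=", which also removes parameters such as "hl=en". The names strip_toolbar_params and rewrite_log are made up for illustration, and the file paths are whatever you pass in.

```python
import re

# Group 1: the Google search URL up to "search?"
# Group 2: one or more "name=value&" parameters sitting in front of q=
# Group 3: the start of the q= parameter itself
TOOLBAR_RE = re.compile(
    r'(https?://[^/"\s]*google\.[^/"\s]+/search\?)'
    r'((?:[^"&\s]*&)+?)'
    r'(q=)'
)


def strip_toolbar_params(log_line):
    """Drop the parameters before q= when one of them is rls=."""
    def _rewrite(match):
        leading = match.group(2).split("&")
        if not any(param.startswith("rls=") for param in leading):
            return match.group(0)  # not a toolbar referrer: leave as is
        return match.group(1) + match.group(3)

    return TOOLBAR_RE.sub(_rewrite, log_line, count=1)


def rewrite_log(src_path, dst_path):
    """Stream a log file line by line and write the cleaned copy."""
    # latin-1 maps every byte to a character, so odd bytes survive the
    # round trip; newline="" keeps the original line endings.
    with open(src_path, encoding="latin-1", newline="") as src, \
         open(dst_path, "w", encoding="latin-1", newline="") as dst:
        for line in src:
            dst.write(strip_toolbar_params(line))
```

One pattern covers both formats, because the URL-encoded variant still begins its parameter with the literal text "rls=". Entries without an "rls=" parameter pass through unchanged.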