Javascript referrer tracking vs Log File referrer tracking

There are big differences!

         

mimmo

2:07 am on Nov 25, 2005 (gmt 0)

10+ Year Member



After having installed Google Analytics on my website, I started to compare javascript tracking code vs standard log file analysis. Here are my results, which I thought may be of interest to other people ... :-)

How they work:
A) Google Analytics is a tracking service based on javascript. Every time a page on your site is requested, some javascript code is executed. The code collects information such as the referrer, screen resolution, URL, etc., then sends it to Google via a request for an empty image [google-analytics.com...] The info is passed to Google Analytics in a query string, e.g. [google-analytics.com...] This ends up in the Google log file. The referrer is read in javascript via document.referrer and encoded in the query string. The millions of query strings in the Google Analytics log files are then decoded hourly by a Google engine and presented to users as reports.
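
A minimal sketch of the mechanism described above, for anyone who wants to see it in code. The endpoint and the parameter names (ref, page, res) are made up for illustration and are not the ones Google Analytics actually uses; only document.referrer, document.URL and the screen size are standard browser properties.

```javascript
// Hypothetical endpoint and parameter names, for illustration only;
// they are NOT the ones Google Analytics actually uses.
const BEACON_ENDPOINT = "https://stats.example.com/pixel.gif";

// Build the image URL whose query string carries the collected data.
function buildBeaconUrl(endpoint, data) {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(data)) {
    // Skip empty values, e.g. an empty document.referrer on a direct visit.
    if (value !== undefined && value !== null && value !== "") {
      params.set(key, String(value));
    }
  }
  return endpoint + "?" + params.toString();
}

// In a browser the data comes from the page itself.
if (typeof document !== "undefined") {
  const url = buildBeaconUrl(BEACON_ENDPOINT, {
    ref: document.referrer,
    page: document.URL,
    res: screen.width + "x" + screen.height,
  });
  // Requesting the image is what writes the entry to the tracking server's log.
  new Image().src = url;
}
```

The query string is the only channel back to the server here, which is why the referrer has to be URL-encoded into it rather than sent as a header.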

B) Log file entries are populated based on the HTTP headers, which are part of the HTTP protocol. One *optional* header is the referrer info (spelled "Referer" in the protocol). It is optional info and not all browsers will send this. Many Internet security suites block this info (e.g. Symantec).
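
For comparison, here is a rough sketch of pulling the referrer out of a log entry. It assumes the server writes the Apache/NCSA "combined" format; the two sample lines are made up, and the regular expression does not handle escaped quotes inside a field.

```javascript
// Matches Apache/NCSA "combined" log lines:
// host ident user [time] "request" status bytes "referer" "user-agent"
const COMBINED_LOG =
  /^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;

function parseLogLine(line) {
  const m = COMBINED_LOG.exec(line.trim());
  if (!m) return null; // not a combined-format line
  return {
    ip: m[1],
    time: m[2],
    request: m[3],
    status: Number(m[4]),
    // "-" means no Referer header arrived with the request.
    referrer: m[6] === "-" ? null : m[6],
    userAgent: m[7],
  };
}

// Made-up sample entries, not real log data.
const withReferrer =
  '192.0.2.10 - - [25/Nov/2005:02:07:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "http://www.google.com/search?q=red+shoes" "Mozilla/5.0"';
const withoutReferrer =
  '192.0.2.11 - - [25/Nov/2005:02:08:00 +0000] "GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"';
```

A blocked header and a genuine direct visit both show up as "-" in the log, so the log alone cannot tell them apart.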

PROS and CONS
-- Javascript tracking services such as Google Analytics only track visitors whose browser supports and executes javascript. CON: Google Analytics cannot track visitors with javascript disabled. PRO: Robots and other spiders do not contaminate the stats.
-- Javascript code is better at identifying unique visitors, as it does not rely on IP, but on other info not available in the log files.
-- *** Far fewer tools block document.referrer in javascript than block the referrer HTTP header *** So with a javascript solution you get better referrer info. Referrer info also covers keyword tracking, since search keywords are included in the referrer field.
-- In the log you have more info available, such as completed downloads, error pages, and robots.
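
To illustrate the keyword point: the search terms sit in the query string of the referring URL, so they can be read from either source once you have the referrer. This is only a sketch; the list of parameter names is an assumption (Google uses "q", other engines use other names).

```javascript
// Assumption: the search engine puts the query in one of these parameters.
// The list is illustrative, not complete.
const QUERY_PARAMS = ["q", "p", "query"];

// Return the search phrase found in a referrer URL, or null if there is none.
function extractKeywords(referrer) {
  if (!referrer) return null;
  let url;
  try {
    url = new URL(referrer);
  } catch (e) {
    return null; // not a parseable absolute URL
  }
  for (const name of QUERY_PARAMS) {
    const value = url.searchParams.get(name);
    if (value) return value.trim();
  }
  return null;
}
```

For example, extractKeywords("http://www.google.com/search?q=red+shoes&hl=en") returns "red shoes", while a missing or blocked referrer returns null, which is why lost referrers also mean lost keywords.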

In conclusion, you will need both! I was surprised that javascript gives you better "referrer" information. But it makes sense to me, since it is far more complex blocking the document.referrer info as opposed to blocking the optional referrer HTTP header.

Receptional Andy

3:58 pm on Nov 25, 2005 (gmt 0)



it is far more complex blocking the document.referrer info as opposed to blocking the optional referrer HTTP header.

I'm not so sure about that - while having javascript enabled and preventing access to document.referrer is not that straightforward, disabling javascript is easier than blocking HTTP_REFERER. IMO the question then is whether more visitors block HTTP_REFERER or disable javascript (or block tracking scripts like Google's).

One *optional* header is the referrer info. It is optional info and not all browsers will send this

While the header is optional, there aren't many browsers that don't send this info by default (unless they are configured not to do so or are prevented by 'security' software).

I would also speculate that people blocking HTTP_REFERER for 'security' reasons are not too many steps away from blocking Google's urchin javascript since the script is easy to identify.

In any case, a javascript tracker should always have a <noscript> alternative to ensure as many visitors as possible can be tracked.

In conclusion you will need both!

Can't argue with that! An ideal system would potentially compare both the HTTP header and javascript referrer information.
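
A rough sketch of what such a comparison could look like. It assumes you can already match a log entry to the script report for the same page view (how you do that is not covered here), and the function name and "source" labels are made up for illustration.

```javascript
// Hypothetical helper: merge the referrer seen in the HTTP header (log file)
// with the one reported by the page script for the same page view.
function reconcileReferrer(headerReferrer, scriptReferrer) {
  const header = headerReferrer || null;
  const script = scriptReferrer || null;
  if (header && script) {
    // Both sources answered; flag the page view if they disagree.
    return { referrer: header, source: header === script ? "both" : "conflict" };
  }
  if (header) return { referrer: header, source: "header" };
  if (script) return { referrer: script, source: "script" };
  return { referrer: null, source: "none" };
}
```

Counting how many page views end up as "header", "script" or "none" would show which source loses more referrers for a given site.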

genem

4:21 pm on Nov 25, 2005 (gmt 0)

10+ Year Member



Great find, mimmo!
I have a simple PHP server-side tracking script, so I guess the next step will be to combine it with a homemade javascript tracker and finally get the whole picture.

You wrote that Javascript trackers use more complex session identification data. I wonder what it is. A combination of ip and browser version?

mimmo

9:16 pm on Nov 25, 2005 (gmt 0)

10+ Year Member



I'm not so sure about that - while having javascript enabled and preventing access to document.referrer is not that straightforward, disabling javascript is easier than blocking HTTP_REFERER. IMO the question then is whether more visitors block HTTP_REFERER or disable javascript (or block tracking scripts like Google's).

You are right. But in my comparisons I found that more referrers are missing from the log file than from the javascript tracking. So it seems that more people are actually blocking HTTP_REFERER than disabling javascript (at least in my user base). Disabling javascript can really impact functionality on many websites, which is probably why. That said, some advanced internet filters can replace document.referrer with document.URL when parsing HTML.


While the header is optional, there aren't many browsers that don't send this info by default (unless they are configured not to do so or are prevented by 'security' software).

I was surprised to find, in my website log file stats, that at least 10% of my AdWords customers send no referrer. Utilities like the Norton security suite block it by default, if I am not mistaken.


I would also speculate that people blocking HTTP_REFERER for 'security' reasons are not too many steps away from blocking Google's urchin javascript since the script is easy to identify.

I already installed the script locally and changed its name :-) not for this reason but for some other testing I am doing :-) It is still working fine!

mimmo

9:21 pm on Nov 25, 2005 (gmt 0)

10+ Year Member



You wrote that Javascript trackers use more complex session identification data. I wonder what it is. A combination of ip and browser version?

They have all the info that is present in the log file + cookies, session id, screen resolution, browser version, OS version, etc.
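
As an illustration of the cookie part, here is a minimal sketch of recognising a returning visitor by a cookie rather than by IP. The cookie name and the id format are made up; this is not how Google Analytics builds its own cookies.

```javascript
// Parse a "name=value; name2=value2" cookie string into an object.
function parseCookies(cookieString) {
  const cookies = {};
  for (const part of (cookieString || "").split(";")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue; // skip fragments without a value
    const name = part.slice(0, idx).trim();
    if (name) cookies[name] = part.slice(idx + 1).trim();
  }
  return cookies;
}

// Hypothetical cookie name; real trackers choose their own.
const VISITOR_COOKIE = "visitor_id";

// Return the existing visitor id, or a new random one for a first-time visitor.
function getOrCreateVisitorId(cookieString) {
  const existing = parseCookies(cookieString)[VISITOR_COOKIE];
  if (existing) return { id: existing, isNew: false };
  // Not cryptographically strong; good enough for a sketch.
  const id = Date.now().toString(36) + "-" + Math.random().toString(36).slice(2, 10);
  return { id: id, isNew: true };
}
```

In a browser you would pass document.cookie and, for a new visitor, write the id back with an expiry date. Two visitors behind the same proxy IP then still count as two, which a log file keyed on IP cannot do.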