tedster - 4:40 am on Feb 6, 2013 (gmt 0)
This referer issue has been around ever since browsers were created - the specifics have changed, but it continues to show a lot of weirdness. We had a discussion of this back in 2002, when Brett_Tabke summarized some of his research:
There are so many variations on referrer behavior from browsers, that if you are within 20-30% of reality you are doing good.
If your log file will allow you to do it, throw out everything but the first visit for any user. Only use those referrers. That will give you the most accurate account.
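For anyone who wants to try the "first visit only" filter, here is a minimal Python sketch. It assumes the log has already been parsed into records, and the `Hit` type and its `visitor` key (for example IP plus user agent) are hypothetical names of mine, not something from the original discussion. How you identify a "user" is the hard part and is left as an assumption here.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    visitor: str      # hypothetical visitor key, e.g. IP plus user agent
    timestamp: float  # request time, seconds since epoch
    referrer: str     # Referer header value as logged ("-" or "" if absent)


def first_visit_referrers(hits: Iterable[Hit]) -> Dict[str, str]:
    """Keep only the referrer of the earliest hit seen for each visitor."""
    first: Dict[str, Hit] = {}
    for hit in hits:
        seen = first.get(hit.visitor)
        # Compare timestamps so the input does not need to be pre-sorted.
        if seen is None or hit.timestamp < seen.timestamp:
            first[hit.visitor] = hit
    return {visitor: hit.referrer for visitor, hit in first.items()}


sample_hits = [
    Hit("a", 10.0, "http://www.google.com/search?q=x"),
    Hit("a", 20.0, "http://www.mysite.com/"),
    Hit("b", 5.0, "-"),
    Hit("b", 1.0, "http://example.org/"),
]
first_referrers = first_visit_referrers(sample_hits)
```

Every later hit from the same visitor is ignored, which sidesteps the repeated-external-referrer and reload problems at the cost of discarding most of the log.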
Other things that will throw off referrers:
- some browsers will only send the root domain for any site.
- some browsers and proxy servers will repeatedly send an external referrer for EVERY page they visit. If a user comes in from Google and visits 20 pages, all 20 pages could see that same Google referral string sent.
- Most clued in Opera users turn off referrals as a security precaution. Mozilla may have an option to do the same soon. They are arguing about it now.
- I have heard that there is a version of msn IE that will not report an external referral under some security settings (not sure, but the pattern fits).
- Revisits. If a page is reloaded, some browsers will send that page itself as the referral. Hence the high proportion of www.mysite.com in your logs.
- no cache mania. Most of the DSL, cable, and other high speed modem manufacturers are telling people to turn off caching in their browser. They all have explicit details on their sites as one of the setup steps to take. That in turn is skewing referral numbers, as even a simple back button can cause a page reload. That referrer will often be the previous page.
It's been my experience that 50 to 75% of insite referrals are not correct. Bookmarks, typed-it-ins, drop down history from address bar, caching, no caching, and reloads have turned insite referral numbers to junk. There are no major log file analyzers that have this fact figured out.
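If you want to see how much of your own log is insite referral traffic before deciding how far to trust it, a rough sketch like the following separates insite, external, and missing referrers. The function name and the default host `www.mysite.com` are placeholders I made up for illustration, and comparing on the exact hostname is a simplifying assumption (it treats `mysite.com` and `www.mysite.com` as different sites).

```python
from urllib.parse import urlsplit


def classify_referrer(referrer: str, site_host: str = "www.mysite.com") -> str:
    """Label a logged referrer as 'none', 'unknown', 'insite', or 'external'."""
    # Many log formats write "-" when no Referer header was sent.
    if not referrer or referrer == "-":
        return "none"
    # hostname is None when the value is not an absolute URL.
    host = urlsplit(referrer).hostname
    if host is None:
        return "unknown"
    return "insite" if host == site_host.lower() else "external"


labels = [
    classify_referrer("http://www.mysite.com/page2.html"),
    classify_referrer("http://www.google.com/search?q=widgets"),
    classify_referrer("-"),
    classify_referrer("not a url"),
]
```

This only sorts the referrers into buckets. It cannot tell you which insite referrals are wrong, which is the point made above.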