

Website Analytics - Tracking and Logging Forum

    
Best approach for Unique Visitor and Page Impression
if you had to build your own web statistic service...
Bernie




msg:890988
 8:02 pm on Dec 29, 2004 (gmt 0)

Let's say you want to develop your own web statistic analysis tool that runs on one web server and tracks all your other sites using a piece of tracking-code.

What would be your most favourable approach to correctly define and track Unique Visitors and Page Impressions?

Unique Visitors

You want to deal with proxy servers and/or browser caches. A possible approach: count a new Unique Visitor only if at least one of the following differs from what has already been recorded:

REMOTE_ADDR (IP address of remote client or of proxy server)
HTTP_X_FORWARDED_FOR (IP address of client behind proxy server, when allowed)
HTTP_USER_AGENT (OS and Browser)
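A minimal sketch of that rule in Python, assuming the three values arrive in a CGI/WSGI-style `environ` dict. The names `visitor_key` and `UniqueVisitorCounter` are made up for illustration, and the in-memory set stands in for whatever storage the real service would use:

```python
import hashlib


def visitor_key(environ):
    """Build a fingerprint from the three request fields listed above."""
    parts = (
        environ.get("REMOTE_ADDR", ""),
        environ.get("HTTP_X_FORWARDED_FOR", ""),
        environ.get("HTTP_USER_AGENT", ""),
    )
    # Join with a control character that is unlikely to appear in the fields,
    # so ("a", "bc") and ("ab", "c") do not collapse into the same key.
    raw = "\x1f".join(parts).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class UniqueVisitorCounter:
    """Counts a new unique whenever any of the three fields differs."""

    def __init__(self):
        self._seen = set()  # assumption: in-memory only, no persistence

    def record(self, environ):
        """Return True if this request counts as a new Unique Visitor."""
        key = visitor_key(environ)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True

    @property
    def count(self):
        return len(self._seen)
```

Two requests with identical values collapse into one unique. The same client seen once with and once without a forwarded address would count twice, which is the over-counting risk discussed further down.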

You might also want to add some code that changes the tracking URL on each request (for example by appending a time stamp or a random string), so that a proxy or browser cache cannot answer the request with a cached copy. This would make tracking more precise if proxy servers are involved.
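One way the cache-busting idea could look, sketched in Python. The parameter names `t` and `r` and the function `tracking_url` are assumptions for illustration. In practice the same thing would more likely be done in the page template or in client-side script:

```python
import secrets
import time
from urllib.parse import urlencode


def tracking_url(base_url, site_id, now=None):
    """Return a tracking URL that differs on every call."""
    params = {
        "ID": site_id,
        # Time stamp in whole seconds; `now` can be injected for testing.
        "t": int(now if now is not None else time.time()),
        # Random suffix so two requests in the same second still differ.
        "r": secrets.token_hex(4),
    }
    return f"{base_url}?{urlencode(params)}"
```

Because the query string changes on each call, a cache that keys on the full URL has nothing stored to serve, so the request has to reach the tracking server.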

These measures should help to prevent the solution from counting fewer UVs than there actually are.

Now what about how to avoid counting more UVs than there really are?

First, there is the spider issue. One solution is to keep a list of blocked user agents and IPs, so that a request from any of them is not counted.
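A rough sketch of such a block list in Python. The user-agent tokens and the network range below are placeholders only (192.0.2.0/24 is a range reserved for documentation). A real list would have to be maintained from your own logs:

```python
import ipaddress

# Illustrative entries only, not a complete or current spider list.
BLOCKED_AGENT_SUBSTRINGS = ("googlebot", "slurp", "msnbot")
BLOCKED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_spider(remote_addr, user_agent):
    """Return True if the request should be excluded from the counts."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in BLOCKED_AGENT_SUBSTRINGS):
        return True
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Unparseable address: do not block on IP grounds.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)
```

The check would run before any Unique Visitor or Page Impression logic, so blocked requests never reach the counters.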

Now, there is still the problem of interrupted dial-in connections and IP changes during one session. I was told that some providers like AOL(?) tend to change IPs even in one session.

For both problems, I could only think of cookies as a solution. However, my concern is that these cookies might be treated as third-party cookies and therefore be blocked by the browser.

Page Impressions

Here I can think of the issue that a new Page Impression should not be counted if the user simply refreshes the page within a certain time period. Something like 10 seconds might be appropriate.
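The refresh rule could be sketched like this in Python. It assumes the visitor is already identified by some key (fingerprint or cookie), and it measures the 10-second window from the last impression that was actually counted. `PageImpressionCounter` is a hypothetical name:

```python
class PageImpressionCounter:
    """Ignores repeat requests for the same page inside a refresh window."""

    def __init__(self, refresh_window=10.0):
        self.refresh_window = refresh_window  # seconds
        self.impressions = 0
        # (visitor, page) -> time of the last counted impression
        self._last_counted = {}

    def record(self, visitor, page, now):
        """Return True if the request counts as a new Page Impression."""
        key = (visitor, page)
        last = self._last_counted.get(key)
        if last is not None and now - last < self.refresh_window:
            return False  # treated as a refresh, not counted
        self._last_counted[key] = now
        self.impressions += 1
        return True
```

With this choice, someone who reloads every few seconds is still counted once per window. Measuring from the last request instead would suppress them entirely, so it is worth deciding which behaviour you want.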

Now do you think this would roughly do the job? I'd appreciate your comments.

 

Bernie




msg:890989
 5:43 pm on Jan 2, 2005 (gmt 0)

anyone?

cfx211




msg:890990
 7:02 pm on Jan 7, 2005 (gmt 0)

We've built our own system that is cookie based. The outline for it is in this post:

[webmasterworld.com...]

We only use it for one site, but adding a cookie_domain column onto the cookie tables will expand it to multiple sites.
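To illustrate the cookie_domain idea only: the sketch below uses a made-up single-table schema in SQLite, not the actual tables from the linked thread. Table and function names are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (cookie, site) pair.
conn.execute(
    """
    CREATE TABLE cookies (
        cookie_id     TEXT NOT NULL,
        cookie_domain TEXT NOT NULL,
        first_seen    TEXT NOT NULL,
        PRIMARY KEY (cookie_id, cookie_domain)
    )
    """
)


def record_cookie(conn, cookie_id, cookie_domain, seen_at):
    """Store the cookie for this site unless it is already known."""
    conn.execute(
        "INSERT OR IGNORE INTO cookies VALUES (?, ?, ?)",
        (cookie_id, cookie_domain, seen_at),
    )


def unique_visitors(conn, cookie_domain):
    """Count distinct cookies seen on one site."""
    row = conn.execute(
        "SELECT COUNT(*) FROM cookies WHERE cookie_domain = ?",
        (cookie_domain,),
    ).fetchone()
    return row[0]
```

Every report then filters on cookie_domain, so one set of tables can serve several sites.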

Let me know if you have questions.

Bernie




msg:890991
 8:22 pm on Jan 7, 2005 (gmt 0)

Thanks for the link to this in-depth thread. I see that cookies are a good way to increase accuracy, especially given AOL (IPs changing within one session) and dial-in connections (different users getting the same IP on the same day).

Unfortunately, I will need a system that is on one server and that serves an arbitrary number of websites. This raises the third-party cookie problem, because the cookie is set by the tracking server and not by the server of the website that is using the stat solution.

Maybe there is a work-around to this problem.

cfx211




msg:890992
 7:31 pm on Jan 10, 2005 (gmt 0)

What 3rd party cookie issue do you see? Just about every popular site on the web has third party cookies set in the form of banner ads.

I would not worry about the users that choose to block these cookies, as I think that is too small a group to care about. One of the things I have always argued is that 100% accuracy in tracking should not be your goal. Getting a tracking system in place that helps you identify and measure the goals of your business is. If 2% of people out there are ultra paranoid and do not want people knowing what they do, then let them be. They are probably not contributing much to your site to begin with.

Unless your privacy policy on these sites prohibits you setting 3rd party cookies, you should be ok.

Bernie




msg:890993
 8:41 pm on Jan 10, 2005 (gmt 0)

what 3rd party cookie issue do you see?

maybe we are not talking about the same thing but AFAIK the problem is that:

- tracking code of the stat service (<img src="http://trackingdomain.com/count.php?ID=blabla">) is implemented in a website on a domain other than trackingdomain.com
- the server trackingdomain.com sets the cookie, therefore the cookie is considered a third-party by the browser.
- IE6 does not allow third-party cookies in default privacy settings.
- marketshare for IE6 around 70% (in Germany)

cgrantski




msg:890994
 10:17 pm on Jan 10, 2005 (gmt 0)

That's different. You're talking about Germany? Are any other EU countries requiring IE to be shipped with that default setting? Just curious.

What about making sure you're employing a first-party cookie? Even ASPs can do things with DNS mapping to ensure that cookies are first-party. (Unless there are laws against it somewhere.)
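A sketch of how the DNS mapping idea might work, under these assumptions: each tracked site points a subdomain of its own (say stats.example.com) at the tracking server, and the tracking image is loaded from that subdomain. The cookie is then set under the site's own domain and should be treated as first-party. The function name, the cookie name `visitor_id`, and the naive parent-domain logic are all illustrative:

```python
import uuid
from http.cookies import SimpleCookie


def visitor_cookie_header(host, cookie_header="", cookie_name="visitor_id"):
    """Return (visitor_id, Set-Cookie value or None if already set)."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if cookie_name in jar:
        # Returning visitor: reuse the ID, nothing new to set.
        return jar[cookie_name].value, None

    visitor_id = uuid.uuid4().hex
    hostname = host.split(":")[0]  # drop an optional port
    # Naive assumption: the parent domain is everything after the first label.
    parent = hostname.split(".", 1)[1] if hostname.count(".") >= 2 else hostname

    out = SimpleCookie()
    out[cookie_name] = visitor_id
    out[cookie_name]["domain"] = "." + parent
    out[cookie_name]["path"] = "/"
    out[cookie_name]["max-age"] = 60 * 60 * 24 * 365  # one year
    return visitor_id, out[cookie_name].OutputString()
```

The tracking server would call this with the Host and Cookie request headers and send the returned value back in a Set-Cookie response header. One server can still handle many sites, since the cookie domain is derived from whichever host name the request came in on.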

It's amusing that cookies are in most ways LESS revealing of a person's private information than the items Bernie listed (IP, IP behind proxy, OS, Browser, and all the oddball things you find in the UA field). Yet it's cookies that whole countries object to!

Bernie




msg:890995
 7:27 am on Jan 11, 2005 (gmt 0)

What about making sure you're employing a first-party cookie?

Great idea, but how can I do it? I'd love to know.

I agree with you: some of the privacy regulations are gaga. When it comes to stat services you only want to make sure your numbers are accurate; you are not revealing the identity of anyone. Cookies don't change that.

Yes, IE6 is not a niche browser and in the distribution's default settings 3rd party cookies are definitely blocked.

