Forum Moderators: DixonJones
I switched over because I found LiveStats to be EXTREMELY slow, and the stats it provided weren't very useful or informative.
Right now, both programs are analyzing the same Web site, and LiveStats is reporting far more unique visits than awstats. Where awstats reported around 6,700 visits, LiveStats reported 14,000. That makes the LiveStats figure more than double: the awstats count is only about 48% of it. Interestingly, the number of total Web hits is about the same, around 250,000.
Of course my client prefers the old numbers! I believe it's simply the fact that awstats is more accurate, but I'm just guessing.
Does anyone have any explanation for such a wide discrepancy?
>> Right now, both programs are analyzing the same Web site and LiveStats is reporting a huge difference in unique visits compared to awstats. Where awstats reported around 6700 visits, LiveStats reported 14,000.
"You can't tell how many visitors you've had. You can guess by looking at the number of distinct hosts that have requested things from you. Indeed this is what many programs mean when they report "visitors". But this is not always a good estimate for three reasons. First, if users get your pages from a local cache server, you will never know about it. Secondly, sometimes many users appear to connect from the same host: either users from the same company or ISP, or users using the same cache server. Finally, sometimes one user appears to connect from many different hosts. AOL now allocates users a different hostname for every request. So if your home page has 10 graphics on, and an AOL user visits it, most programs will count that as 11 different visitors!"
From "How The Web Works" by Stephen Turner
[analog.cx...]
A document that is *very* highly recommended for an understanding of web site statistics based upon log file analysis.
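To make the quoted point concrete, here is a minimal Python sketch of the "distinct hosts" estimate. The hostnames and log entries are made up for illustration, and real analyzers read this from the server log rather than from a list.

```python
# Hypothetical log entries as (host, requested_path) pairs.
# One AOL-style user fetches a page plus 10 graphics, and every request
# arrives from a different proxy hostname (the names are invented).
aol_requests = [
    (f"proxy-{i}.example.net", "/index.html" if i == 0 else f"/img/{i}.gif")
    for i in range(11)
]

# Three co-workers who all reach the site through one company gateway.
office_requests = [("gateway.example.com", "/index.html")] * 3


def distinct_hosts(entries):
    """Estimate 'visitors' the naive way: one per distinct requesting host."""
    return len({host for host, _path in entries})


print(distinct_hosts(aol_requests))     # one real person, overcounted
print(distinct_hosts(office_requests))  # three real people, undercounted
```

The same estimate overcounts in one case and undercounts in the other, which is why the distinct-host figure is only a guess.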
Matt
Given the shortcomings of what gets into logs, there are still many reasons why one perfectly good software package will produce different results from another perfectly good software package. We've talked about two different statistics, hits (possibly RTMac really means page views) and visits. RTMac says hits are about the same for the two stats packages, but visits are wildly different, so let's stick with that one measure: visits. And let’s assume the other measure really is hits and not page views; hits includes images, pdfs, .js files, .css files, and so forth.
Suppose program A filters out images etc. before clumping hits into visits while program B does not. Then program A will show fewer visits in those cases where somebody else’s site is calling one of your images or scripts without actually sending visitors to your site. Program B will count each of these references to a non-page file as a visit and program A will not. I’ve seen a few cases where a single image referenced this way makes an enormous difference.
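A rough Python sketch of that difference, for anyone who wants to try it on their own logs. The extension list and the sample log are assumptions of mine, not what either package actually uses, and "visit" is simplified here to "distinct IP".

```python
import os

# Assumed list of extensions treated as "not a page"; real packages differ.
NON_PAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico"}


def is_page(path):
    """True when the path does not look like an image, script or stylesheet."""
    path_only = path.split("?", 1)[0]
    ext = os.path.splitext(path_only)[1].lower()
    return ext not in NON_PAGE_EXTENSIONS


def count_visitors(entries, pages_only):
    """Count distinct IPs, optionally dropping non-page requests first."""
    return len({ip for ip, path in entries if not pages_only or is_page(path)})


# Made-up log: two real visitors, plus three strangers whose browsers only
# fetched /logo.gif because another site references it.
log = [
    ("10.0.0.1", "/index.html"), ("10.0.0.1", "/logo.gif"),
    ("10.0.0.2", "/about.html"), ("10.0.0.2", "/style.css"),
    ("10.0.0.3", "/logo.gif"), ("10.0.0.4", "/logo.gif"), ("10.0.0.5", "/logo.gif"),
]

program_a = count_visitors(log, pages_only=True)   # filters before counting
program_b = count_visitors(log, pages_only=False)  # counts every request
print(program_a, program_b)
```

Neither count is wrong; they just answer different questions about the same log.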
Suppose program A doesn't count a visit if it consists of a 404 and nothing else, while program B still counts it. Program A will show fewer visits, especially if there's a 404 on a page referenced by a big-traffic link on another site.
Suppose program A considers a visit “closed” after 30 minutes of inactivity while program B uses 15 minutes. Then program A will show fewer visits because it will not be as likely to count a distracted person’s single visit as two visits.
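Here is a small Python sketch of the timeout rule for a single visitor. The hit times are invented, and the function is only an illustration of the idea, not how LiveStats or awstats implement it.

```python
def count_visits(hit_times_minutes, timeout_minutes):
    """Count visits for ONE visitor, given hit times in minutes.

    A new visit starts whenever the gap since the previous hit
    is longer than timeout_minutes.
    """
    visits = 0
    previous = None
    for t in sorted(hit_times_minutes):
        if previous is None or t - previous > timeout_minutes:
            visits += 1
        previous = t
    return visits


# Made-up visitor: reads two pages, is distracted for 20 minutes, comes back.
hits = [0, 3, 23, 25]

visits_a = count_visits(hits, timeout_minutes=30)  # program A's rule
visits_b = count_visits(hits, timeout_minutes=15)  # program B's rule
print(visits_a, visits_b)
```

Same hits, same arithmetic, and program B still reports twice as many visits for this person.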
Suppose program A is smart enough to know about AOL’s many proxy IPs and program B is not. Then program A will show fewer visits because an AOL visit won’t be broken up into lots of little visits. Of course, program A could also over-consolidate, which is why session cookies are nicer than IPs for determining visits. This is a big issue, affecting far more IPs than just AOL’s.
Suppose program A combines the IP with the User Agent field in the logs, while program B just uses IP address. Then program A will show more visits because it will have less of the over-consolidation. In other words, it will be more able to detect different users coming from the same IP.
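A quick Python sketch of the two keys side by side. The IPs and user agent strings below are sample values I made up for the illustration.

```python
# Made-up hits as (ip, user_agent) pairs: two people behind one office proxy
# using different browsers, plus one person at home.
hits = [
    ("192.0.2.10", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("192.0.2.10", "Mozilla/5.0 (Macintosh; U; PPC Mac OS X) Safari/125"),
    ("192.0.2.10", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("198.51.100.7", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
]


def visitors_by_ip(entries):
    """Program B's key: the IP address alone."""
    return len({ip for ip, _ua in entries})


def visitors_by_ip_and_agent(entries):
    """Program A's key: the IP address combined with the User Agent."""
    return len({(ip, ua) for ip, ua in entries})


print(visitors_by_ip_and_agent(hits), visitors_by_ip(hits))
```

Note that even the combined key cannot separate two people behind the same proxy who run the same browser, so it reduces over-consolidation without removing it.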
These are just a few of the more common reasons, and I can think of maybe 5 or 6 more. Both programs are doing their math correctly, but they are working with different assumptions and processes. The “correctness” of each number depends on what you want to count: do you want somebody who stops looking at your site for 16 minutes to be counted as 2 visits? Do you want to know about those other sites that are calling one of your graphics? Do you want to know about attempted visits that hit a 404 and quit?
All of this is one reason why, perhaps, you get what you pay for in a statistics program. I’m not saying that free or cheap ones are never as smart as not-so-cheap ones, but you have to admit that if you have revenue to pay for the extra programming required to deal with all this and the extra writing to document what’s going on, you have a somewhat better chance of producing a somewhat better program.
For keeping stats, I like using a 1x1 pixel non-cacheable image that's actually a php file. It records the number of pageviews, which page was viewed, and the referrer, and sticks all of this into a db. It's easy enough to implement by just adding a couple of lines to your page; I do it with js.
This also keeps track of the 'real referrers', which is nice: the actual URLs that users come from. Most stats packages I've come across don't re-assemble page request strings into actual URLs, so you have no idea which page of Google your visitor came from, even though you might know the search terms used.
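For anyone curious what such a pixel does on the server side, here is a rough sketch. It is in Python with SQLite rather than the PHP and db mentioned above, and the `/pixel.gif` path, the `page` and `ref` parameter names, and the table layout are all assumptions for the sake of the example.

```python
import sqlite3
from urllib.parse import urlsplit, parse_qs

# A transparent 1x1 GIF, sent back for every tracking request.
PIXEL_GIF = (
    b"GIF89a\x01\x00\x01\x00\x80\x00\x00\x00\x00\x00\xff\xff\xff"
    b"!\xf9\x04\x01\x00\x00\x00\x00,\x00\x00\x00\x00\x01\x00\x01\x00"
    b"\x00\x02\x02D\x01\x00;"
)


def open_db(path=":memory:"):
    """Open the stats database and create the (hypothetical) hits table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "ts TEXT DEFAULT CURRENT_TIMESTAMP, page TEXT, referrer TEXT)"
    )
    return conn


def handle_pixel_request(conn, request_url):
    """Record one pageview and return (headers, body) for the pixel response.

    request_url is what the page's script requested, for example
    /pixel.gif?page=<page path>&ref=<full referrer URL>, both URL-encoded.
    """
    query = parse_qs(urlsplit(request_url).query)
    page = query.get("page", [""])[0]
    referrer = query.get("ref", [""])[0]
    conn.execute("INSERT INTO hits (page, referrer) VALUES (?, ?)", (page, referrer))
    conn.commit()
    headers = {
        "Content-Type": "image/gif",
        "Content-Length": str(len(PIXEL_GIF)),
        # Tell browsers and proxies not to cache, so every pageview is seen.
        "Cache-Control": "no-store, no-cache, must-revalidate",
        "Pragma": "no-cache",
        "Expires": "0",
    }
    return headers, PIXEL_GIF


conn = open_db()
headers, body = handle_pixel_request(
    conn,
    "/pixel.gif?page=%2Findex.html"
    "&ref=http%3A%2F%2Fwww.example.com%2Fsearch%3Fq%3Dweb%2Bstats",
)
rows = conn.execute("SELECT page, referrer FROM hits").fetchall()
print(rows)
```

The function is deliberately not tied to any web server; whatever handles the request would send the returned headers and body. Because the script on the page passes the full referrer URL, the complete address of the referring page ends up in the table, not just the search terms.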
The best you can do is not to think of these figures as anything close to the number of human beings using your site. Ultimately, that's the figure we're all interested in, but logs as well as cookies simply fail to deliver, for a multitude of reasons.
I would recommend, though, trusting cookies over log files any time. They will generally be closer to the truth, and (depending on the specific methods) they might be the closest bet there is.
>> There was a good thread or three on this
In December, Receptional made a very nice post with several good points here:
How to track visitors [webmasterworld.com]
Do follow the links to earlier threads I posted in msg#2. If you manage to read through that collection of about 30 threads (minor threads are omitted; some of those selected are very long and very informative), you'll know quite a bit about the pitfalls, as well as alternative products/services.
I feel as if I've just left a seminar on Web stats! Thank you to all of you that contributed.
Although I realize I've barely scratched the surface of all the variations and interpretations of Web site statistics, I can now give my client a much more educated reply to her questions about different results.
Thanks again!