Forum Moderators: DixonJones

Message Too Old, No Replies

interpreting raw log files

How can a dummy like me use webalizer or awstats?


Onders

11:54 am on Apr 8, 2005 (gmt 0)

10+ Year Member



Hi, I'm having a few problems with the stats package included with my hosting package. I have access to the raw log files (web.log.1.gz), so I have downloaded both awstats and webalizer to try to interpret the results. I've tried reading the manuals and instructions but got horribly lost... Is there an easy way to interpret the results?! Thanks

benni_203

1:14 pm on Apr 8, 2005 (gmt 0)

10+ Year Member



I had a similar problem (I am a newbie).

Why don't you try an offline weblog analyzer? There is software (some free, some commercial) that you can install on your own PC. I am working with the free WebLog Expert Lite: I simply copy the logs via FTP to my local PC and point the software to the local path where my logs are. It is basic, but it gives me some ideas.

Benni

P.S. I am still a little confused about the log file formats of my two different web hosts. One has e.g. mydomain1111017600.gz AND referrer_log.1111017600.gz, while the other just has e.g. access_log_2005-03-07.gz, which confuses me, but the software apparently handles both fine. If anyone has input on that one, I would be thankful

Onders

1:23 pm on Apr 8, 2005 (gmt 0)

10+ Year Member



Thanks Benni - the free WebLog Expert Lite is actually paid software with a 30-day free trial. I'll definitely try it out, but do you know of any free software out there as well?

benni_203

1:36 pm on Apr 8, 2005 (gmt 0)

10+ Year Member



Hi Onders, unless I understood it wrong: WebLog Expert Lite should be freeware. The 30-day free trial is for WebLog Expert. I am usually very careful not to abuse free trial software (I even licensed my WinZip ;-))

Benni

Onders

1:51 pm on Apr 8, 2005 (gmt 0)

10+ Year Member



Cool! Cheers for the info. Just ran it on my raw weblogs and got stats for the last few days. On Wednesday Awstats suggests I got 1700 visitors, whilst WebLog Expert Lite says I got 2500... Interesting! Will see what the conversion rates are like... Can anyone explain the difference though?

JsnGrnlw

5:35 pm on Apr 8, 2005 (gmt 0)

10+ Year Member



I think benni_203 has the right idea with the offline analysis but I would suggest NetTracker Lite.

It is completely free, and it'll go out and get the log files for you so you don't have to fetch them manually.

I've been using it since it was released a few days ago and I've been blown away.

If you want a link just send me a sticky mail and I'll send it over.

-Jason

Receptional

5:42 pm on Apr 8, 2005 (gmt 0)



> Awstats suggests I got 1700 visitors, whilst Weblog Expert Lite says I got 2500.. Interesting! Will see what the conversion rates are like.. Can anyone explain the difference though?

Exactly why I don't use log files [webmasterworld.com]

McElvoy

2:28 am on Apr 13, 2005 (gmt 0)

10+ Year Member



Let's please not blame everything on log files all the time. That's getting old. We could try answering the question.

The issue of differences between results for different stats programs has been discussed here before and I'll try to find some of those old items because they had some very good posts.

Some programs filter out spiders and bots; others don't. Spiders and bots can be up to 50% of your traffic.

Some programs use semi-sophisticated ways of tracking visitors such as a persistent cookie. Others try to use a session cookie, mistakenly. Others use a concatenation of the IP address and the User Agent field. Others just use the IP address. The difference between visitor counts, for a busy site, using the first vs. last method can easily be 2x.
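The gap between these counting methods is easy to demonstrate. Here is a minimal Python sketch (my own illustration, not code from any of the packages mentioned) that counts the same hits two ways:

```python
def count_visitors(hits):
    """Count unique 'visitors' from (ip, user_agent) pairs two ways.
    Shared proxy IPs and multi-browser users make the counts diverge."""
    by_ip = {ip for ip, _ in hits}  # crudest method: IP address only
    by_ip_ua = set(hits)            # IP + User-Agent concatenation
    return len(by_ip), len(by_ip_ua)
```

Two different browsers behind one proxy IP count as one visitor by IP alone but as two by IP + User-Agent, which is exactly the kind of gap that shows up between packages.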

Some programs that try to use cookies will be clumsy about matching the first cookieless hit to the rest of the visit, and can count those first hits as separate visitors.

Some programs don't ignore images and so will count as visitors any orphan hits to images, such as images displayed in an e-mail or on another site.

It's also possible that you are misreading their output and you are accidentally comparing visit statistics with visitor statistics.

And so on. Your two programs aren't wrong, they are just using different methods and assumptions, and they're too low-level to either tell you the possible differences or let you change how they work. What they do is not magic. You can do pretty much the same thing with Excel and an investment of some of your time; in fact, everybody should do that at least once. It will help you realize how different programs come up with different results.

Regarding the different log file names: these are just different ways Apache servers store and name logs, and one method is older than the other. They may or may not contain exactly the same kinds of information, depending on how the hosting people bothered to set up logging. Open them up in a text editor or Excel (they are space delimited) and puzzle out the contents. Search for articles on "Apache log file formats" for help in knowing what the fields are. Even primitive log analyzers can handle the different formats as they are among the most common formats in the world.
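If you prefer a script to Excel, a rough sketch of parsing Apache's "combined" format looks like this (the regex is a common pattern I'm assuming here, not taken from any particular analyzer):

```python
import re

# Apache "combined" log format; the older "common" format simply lacks
# the two trailing quoted fields (referrer and user agent).
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?$'
)

def parse_line(line):
    """Return a dict of log fields, or None if the line does not parse."""
    m = LINE_RE.match(line.rstrip('\n'))
    return m.groupdict() if m else None
```

Once the fields are in dicts, you can filter bots, ignore image hits, or group visits however you like, and see for yourself why two packages disagree.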

benni_203

2:05 pm on Apr 13, 2005 (gmt 0)

10+ Year Member




That was a very informative post. Thank You!

Benni_203

Onders

2:30 pm on Apr 13, 2005 (gmt 0)

10+ Year Member



I'll second that! Thanks McElvoy

Mardi_Gras

6:51 pm on Apr 13, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> Let's please not blame everything on log files all the time. That's getting old. We could try answering the question.

You must have missed the link in Receptional's post. If you check it, you will discover that he is quite knowledgeable about visitor tracking, and did, in fact, "answer the question" as you put it, quite some time ago. In fact, he "answered the question" before most of the posters in this thread had ever heard of WebmasterWorld :)

McElvoy

12:34 am on Apr 14, 2005 (gmt 0)

10+ Year Member



1. Pointing somebody to an old item that goes on for SEVEN PAGES is not helpful.

2. Just saying "exactly why I don't use log files" does not help the user. A question was asked and an appropriate response would be, well, an answer, not a self-oriented editorial remark. The responder was capable of an effective answer but instead chose to make a sweeping editorial remark.

3. The old item started by equating log files to the absence of cookies, which was outdated and wrong long before the old item was written.

4. The old item did not answer the user's question even indirectly, much less correctly or helpfully.

Mardi_Gras

1:10 am on Apr 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>1. Pointing somebody to an old item that goes on for SEVEN PAGES is not helpful.

This is getting off the topic of the original post.

cgrantski

4:17 pm on Apr 14, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I think that's what McElvoy is saying, namely that the comment about "that's why I don't use logs" was off the subject in the first place. I'm ready to move on ....

Reid

7:02 am on Apr 16, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Using different log file analysers will always give you different numbers - that's a given, simply because each uses a different algorithm to interpret the logs.

I personally don't use client-side tracking because it adds even more overhead to the browser load time for inconclusive data: cookies are disabled, java is disabled, etc. etc.

So I trust server logs more - not for tracking users but for tracking page requests: which pages are more active. Search engine queries can give you a good idea what users are looking for.

I use Analog - a free program that you can run on your home PC. It has a HUGE learning curve to set up, but it's not impossible. What I like about it is that it is totally customisable: I can glean whatever reports I want (once I set it up).

The only way you can use any tracking software is to consistently use the same method and observe trends. It is not totally accurate info, but trends can be seen (provided you are using the same software to measure them). Once you switch to a different system you can't compare it with your old one - you have to start over.

So pick one you like - hopefully one that shows some detail - and use it over a period of time before you make any conclusions. Don't think about 'how much traffic'; think more along the lines of 'traffic going up, traffic going down', traffic from bots vs. browsers, time spent on site by viewers (going up or down), cached pages, etc.

devi8or

10:28 am on Apr 21, 2005 (gmt 0)

10+ Year Member



What about those of us who track Stats for Security purposes?

That is the only reason my server does logging. For me, how many people come to my site is irrelevant; I am more concerned about some bonehead trying to hack/attack my server.

I do visitor logging/analysis the old-fashioned way: every hit is appended to a .txt file, and at the end of each day (or when I feel like it) I take the day's logs, put them into another file, and archive it for myself and the Security Admin. (This process takes about 3-5 minutes, depending on how busy the server was.)
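A minimal Python sketch of that daily routine (file names and paths are illustrative, not devi8or's actual setup):

```python
import gzip
import shutil
from datetime import date
from pathlib import Path

def archive_daily_log(log_path, archive_dir):
    """Gzip the current flat hit log into a dated archive file,
    then truncate the live log so the next day starts fresh."""
    log_path = Path(log_path)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / f"hits-{date.today().isoformat()}.txt.gz"
    with log_path.open('rb') as src, gzip.open(dest, 'wb') as out:
        shutil.copyfileobj(src, out)
    log_path.write_text('')  # empty the live log for the next day
    return dest
```

In real use you would rotate atomically (or have the server reopen the file) rather than truncate a log that is still being written; this just shows the shape of the routine.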

Reading these made me want to put in my 2 cents, but that is what forums are for, right?

Anyways, y'all have a good day...

-- The DEVI8OR