Log Analysis

Forum Moderators: DixonJones

Log Analysis

for local machine - in depth aggregate analysis

lorax

8:41 pm on Jan 18, 2003 (gmt 0)

Hola,
I need to analyze extended log files (2 years' worth) and would like to do it on my local machine. I do have a webserver running for development work and I don't want to do this work on my live webserver. The tool must be able to do custom reports.

I've seen an earlier thread re: log analysis but it seemed to be centered on production servers. What I want is to be disconnected from the web while I analyze these files. Suggestions?

bcc1234

9:14 pm on Jan 18, 2003 (gmt 0)

Use any tool you like, just use it on your desktop.

lorax

2:03 am on Jan 19, 2003 (gmt 0)

*sheepishly* It's that simple? I thought most of them like to have a live IP or DN to work with?

bcc1234

2:32 am on Jan 19, 2003 (gmt 0)

If your app tries to resolve host names then you need an active tcp/ip stack and a dns server (pretty much just be connected to the internet).

Also, some log analyzers try to access the target site and check pages or images or other stuff - that would also require an internet connection.

You don't need it for anything else a log analyzer might do.
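
To illustrate the point about host names: the lookup step is the only part that needs the network, so it can be done once and cached. A minimal Python sketch, assuming you only want to map the IPs found in a log to names; `reverse_dns`, `resolve_all` and the `lookup` parameter are made-up names for illustration, while `socket.gethostbyaddr` is the standard library call that needs a working DNS connection:

```python
import socket


def reverse_dns(ip):
    """Return the host name for an IP, or the IP itself if lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # No DNS server reachable, or no reverse record: keep the raw IP.
        return ip


def resolve_all(ips, lookup=reverse_dns):
    """Resolve each distinct IP once and return an {ip: name} cache."""
    cache = {}
    for ip in ips:
        if ip not in cache:
            cache[ip] = lookup(ip)
    return cache
```

With no connection the sketch simply falls back to the raw IP addresses, which is all an off-line analysis loses.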

bill

3:01 am on Jan 19, 2003 (gmt 0)

I run Analog on my local PC and run the logs locally for several sites. It's completely customizable and should work for you (it's free). The biggest drawback is its DNS lookups which you might want to delegate to a free utility called QuickDNS made by AnalogX (no connection).

lorax

4:29 am on Jan 19, 2003 (gmt 0)

Thanks guys, I just downloaded and installed NetTracker and WebTrends this evening for comparison.

NetTracker worked like a charm - downloaded my log files and am now working with them off-line. WebTrends, on the other hand... problems with the install. Will have to try again in the morning.

ThomasAJ

12:32 pm on Jan 19, 2003 (gmt 0)

Bill

Does Analog show referrals by search engine without having to define search engine URLs to it? E.g. does one have to list the search engines in a file that it uses, say "google.com, google.ca, google.xyz", etc.?

Similar question for spiders. Does one have to list all common spiders and spider sources in order to know spidering details?

Last question: does it show "time spent" on web pages?

Thanks

Tom

lorax

1:26 pm on Jan 19, 2003 (gmt 0)

Update - FYI,
NetTracker wins hands down. The interface is more intuitive and it doesn't use pop ups. I also like the colors better. ;)

But seriously, it allows me to follow even one person through the site, to see what they visited in the order they visited it. Good way to follow Googlebot and learn how well I did. Couldn't find similar in WebTrends - unless I'm not looking in the right place. I'm using NetTracker Pro and WebTrends Reporting Center.

chiyo

1:31 pm on Jan 19, 2003 (gmt 0)

lorax, I'm assuming you know the problems with the validity of both "path" data and time-on-page data when using raw logs as the input? Getting to that depth of analysis compounds all the already "iffy" data.

Caching is the main culprit. We did look at this many months ago and concluded that the validity of the data was very suspect, no matter what precautions you took, though the graphs sure looked "pretty" for clients!

Using sessions and click tracking via scripts seems to hold the most promise for tracking paths better but of course does have its downside as well, especially in page loading times and resources required.

bill

1:39 pm on Jan 19, 2003 (gmt 0)

Does Analog show referrals by search engine without having to define search engine URLs to it.

There's a guy who does this for you, so all you have to do is download a text file when it's updated (Israel Hanukoglu's SearchQuery.txt [science.co.il] file). I've made a few contributions to his list... it's pretty comprehensive and updated at least once a month...
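
For anyone curious what such a list is doing, here is a rough Python sketch of the idea, not Analog's own file format. The `SEARCH_ENGINES` table and the function name `search_phrase` are hypothetical; the table just maps a domain fragment to the query-string parameter that holds the search phrase:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical table: domain fragment -> parameter holding the search phrase.
SEARCH_ENGINES = {
    "google.": "q",
    "yahoo.": "p",
}


def search_phrase(referrer):
    """Return (engine, phrase) for a search-engine referrer, else None."""
    parts = urlparse(referrer)
    host = parts.netloc.lower()
    for fragment, param in SEARCH_ENGINES.items():
        if fragment in host:
            values = parse_qs(parts.query).get(param)
            if values:
                return fragment.rstrip("."), values[0]
    return None
```

Matching on a fragment such as "google." is what lets one entry cover google.com, google.ca and so on, at the cost of the occasional false match.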

Does one have to list all common spiders and spider sources in order to know spidering details.

No, it will simply tell you which spiders have been visiting you... it's up to you to find out what they are. There are some good ROBOTINCLUDE suggestions in this thread [webmasterworld.com]... and I think there are a few more here that I can't immediately find.
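
As a rough picture of what a robot list does, here is a small Python sketch that counts visits by user-agent substring. It is not how Analog implements ROBOTINCLUDE; the `ROBOT_MARKERS` tuple and `count_robots` are hypothetical names, and the markers are only examples:

```python
from collections import Counter

# Hypothetical substrings to look for; the real list is whatever you track.
ROBOT_MARKERS = ("googlebot", "slurp", "crawler", "spider")


def count_robots(user_agents):
    """Count user-agent strings containing a known robot marker."""
    counts = Counter()
    for agent in user_agents:
        lowered = agent.lower()
        for marker in ROBOT_MARKERS:
            if marker in lowered:
                counts[marker] += 1
                break  # count each request once, under the first match
    return counts
```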

Does it show "time spent" on web pages

There may be a way to do this, but I haven't used it.

lorax

1:53 pm on Jan 19, 2003 (gmt 0)

Chiyo, I'm not sure I'm aware of those issues and thank you for bringing them up.

If the webserver knows that someone on IP A visits index.htm, and 2 seconds later someone on IP A visits page1.htm I am under the impression that the software assumes it is the same person and considers that a gossamer trail. NetTracker does allow me to set a session timeout length so if I choose this I expect it will change results somewhat.
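
That assumption (same IP plus a timeout equals one visitor) can be sketched in a few lines of Python. This is only an illustration of the general technique, not NetTracker's actual logic; `sessions_by_ip` and the 30-minute `SESSION_TIMEOUT` are assumed names and values:

```python
from collections import defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds; an assumed default, not NetTracker's


def sessions_by_ip(hits, timeout=SESSION_TIMEOUT):
    """Group (timestamp, ip, page) hits into per-IP sessions.

    A new session starts whenever the gap since that IP's previous
    request exceeds `timeout` seconds.
    """
    sessions = defaultdict(list)  # ip -> list of sessions (lists of pages)
    last_seen = {}                # ip -> timestamp of the previous hit
    for ts, ip, page in sorted(hits):
        if ip not in last_seen or ts - last_seen[ip] > timeout:
            sessions[ip].append([])
        sessions[ip][-1].append(page)
        last_seen[ip] = ts
    return dict(sessions)
```

Changing `timeout` changes how many "visits" the same raw log produces, which is why the setting affects the results.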

With regards to caching - hmmmm I hadn't thought of that but that's not that important to me. If they use cached pages - no big deal, it means they've visited the site at least once. My pages do have a cache timeout (not perfect I know) and my content is more informational than saleable.

[edited by: lorax at 4:04 pm (utc) on Jan. 19, 2003]

madcat

4:50 am on Jan 24, 2003 (gmt 0)

bill- this may be simple enough, but I have several websites running on my host's computer... How do you use Analog to monitor multiple sites? That is, if you don't have control of httpd.conf to add virtual hosts (I think)?

bill

4:56 am on Jan 24, 2003 (gmt 0)

madcat I haven't run into that as my setups produce completely separate log files. Maybe someone here has worked with that. Otherwise I'd suggest taking a look through the on-line help docs for Analog. They're quite extensive and easy to navigate. Maybe there's something in there?

Woz

5:07 am on Jan 24, 2003 (gmt 0)

I had to parse some Webstar logs that included logs for all the virtual sites on the server. The way to restrict to one site was to use

VHOSTINCLUDE www.sitename.com

which seemed to work. Set up a number of .cfg files, each tweaked for a particular site, and try that.
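
If the analyzer at hand has no such option, the same restriction can be done as a pre-filter. A minimal Python sketch, assuming (hypothetically) that the virtual host name is the first whitespace-separated field on every log line; real Webstar logs may be laid out differently, and `lines_for_vhost` is a made-up name:

```python
def lines_for_vhost(lines, vhost):
    """Yield only the log lines whose first field is `vhost`.

    Assumes the server writes the virtual host name as the first
    whitespace-separated field of every line (a hypothetical layout).
    """
    wanted = vhost.lower()
    for line in lines:
        fields = line.split(None, 1)
        if fields and fields[0].lower() == wanted:
            yield line
```

Writing the filtered lines to one file per site gives each .cfg file its own input.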

Onya
Woz

jlr1001

12:28 pm on Jan 28, 2003 (gmt 0)

I have several websites running on my host's computer... How do you use Analog to monitor multiple sites?

I just figured this out. As Woz suggested, I set up a number of cfg files. However, to keep Analog from still using the main Analog.cfg file, I created shortcuts for each client website (I'm using Windows, but you can easily do this in anything as long as you can get to a command prompt).

In the command line for each shortcut I use the -G flag, which basically tells Analog not to look at its default config file.

Then, say a particular client is Widget Bazaar and I created a specific widgebaz.cfg file for it: after the -G flag I add +gwidgebaz.cfg, which tells Analog which config file to use (or which one to read in addition to the default config file if you didn't include the -G flag).

I created Windows shortcuts for each client, using the above format, pointing to the main Analog.exe file so that I don't have to type the appropriate commands at a prompt each time I want to run a specific analysis.
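
The same thing can be scripted instead of kept in shortcuts. A small Python sketch that builds the command line described above; only the -G and +g flags come from this post, while `analog_command`, `run_all` and the default executable name "analog" are assumptions for illustration:

```python
import subprocess


def analog_command(config_file, analog_exe="analog"):
    """Build the command line for one client's config file.

    -G skips the default config file; +g<file> names the config to read.
    `analog_exe` is an assumed path to the Analog executable.
    """
    return [analog_exe, "-G", "+g" + config_file]


def run_all(config_files, analog_exe="analog"):
    """Run Analog once per config file, stopping on the first failure."""
    for cfg in config_files:
        subprocess.run(analog_command(cfg, analog_exe), check=True)
```

Calling `run_all(["widgebaz.cfg"])` would then do what double-clicking the Widget Bazaar shortcut does.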

Hope that's not too long-winded.

-jlr1001

Powdork

9:53 am on Jan 30, 2003 (gmt 0)

Hi,
I just tried out NetTracker after hearing lorax's good reviews. I'm very impressed. Anyone got a coupon? ;)

palmpal

10:18 am on Feb 2, 2003 (gmt 0)

Wow, two years' worth of log files! How did you capture this data? My logs are rotating and disappearing! It's been frustrating trying to get an answer from the company!

Thanks.

jamesa

11:38 am on Feb 2, 2003 (gmt 0)

If the webserver knows that someone on IP A visits index.htm, and 2 seconds later someone on IP A visits page1.htm I am under the impression that the software assumes it is the same person and considers that a gossamer trail.

Just beware that the software needs to make many assumptions, so what you're getting is not hard data but rather just a guesstimate.

I've noticed that the IP address of AOL users can change with every request. Also, there can be many people accessing your site from behind a router, so they will all show the same IP. And defining how long someone spent on a page is very suspect - they could have just walked away from the computer or answered the phone.
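
To make the time-on-page caveat concrete, here is a minimal Python sketch of how such a figure is typically derived from a log: the gap between one request and the next. The function name `time_on_pages` and the input layout are assumptions, not any particular product's method:

```python
def time_on_pages(session):
    """Estimate seconds spent on each page of one session.

    `session` is a list of (timestamp, page) tuples in request order.
    The time on a page is taken as the gap until the next request, so
    the last page has no estimate and is reported as None.
    """
    estimates = []
    for (ts, page), (next_ts, _) in zip(session, session[1:]):
        estimates.append((page, next_ts - ts))
    if session:
        estimates.append((session[-1][1], None))
    return estimates
```

A 15-minute gap could be careful reading or a phone call; the log cannot tell the two apart, and the exit page never gets a figure at all.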

NetTracker looks nice, btw. Thanks for the tip :)

werty

9:03 pm on Feb 7, 2003 (gmt 0)

oh that nettracker is nice.

it is only 11 clicks different from my overture reports.

webtrends was about 600 clicks off.

also i like how it tells you the entrance and exit pages.

binki

1:27 pm on Feb 8, 2003 (gmt 0)

I haven't, but I think Summary was written by the same guy, and it beat the pants off another Analog helper app I tried. I'd trust another product from him.

gibbon

1:34 pm on Feb 8, 2003 (gmt 0)

i don't think that clicktracks http://www.clicktracks.com is another analog helper app.

think it is a completely new app. it just looks so stunning!

aspdaddy

2:15 pm on Feb 8, 2003 (gmt 0)

Too many companies sell them on the 'pretty output' factor, but if you are doing any type of content affinity or aggregate path analysis there are way more important issues.

lorax

4:31 pm on Feb 8, 2003 (gmt 0)

there are way more important issues

Which is one of the reasons I went with NetTracker. I can create my own reports and filters as well as modify the underlying criteria the program uses to analyze the data with. So rather than relying on NetTracker to make the decisions about what is considered a repeat visitor, I can tell it what I consider a repeat visitor.

gibbon

6:24 pm on Feb 8, 2003 (gmt 0)

yes JamJar, the demo looks ace! we are trying to integrate shortly so I have no "real world" opinion of it.

Yet it looks quite a bit cheaper than NetTracker and the guy from analog seems to have written it. What more could you want!

Hilary

11:52 pm on Feb 8, 2003 (gmt 0)

The pictures certainly look very pretty. But if you can download your own log files...

...oops, just read the charter about no specific software discussions, and deleted a comparison of price and features. A Google search on 'Summary log analysis' will enable you to make your own comparison.

karakas

6:45 am on Feb 9, 2003 (gmt 0)

I guess using that piece of JS code enables you to do a finer analysis of your visitors' behaviour on your website than log files alone. But I haven't tried it, so I am interested in the merits of such a technique. Can we expect it to really add value beyond what an ordinary log file already offers?

aspdaddy

10:21 am on Feb 9, 2003 (gmt 0)

>I am interested in the merits of such a technique

Well, generally these record only actual page views, so there is much less data to analyse afterwards. It also means you don't have to do as much 'cleaning' of the data.

But this approach can be very limiting as well: if you are using images to track newsletters, adverts etc., you miss out on a lot of potential data.
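
For comparison, the 'cleaning' that raw logs need usually starts with throwing away requests that are not page views. A rough Python sketch under stated assumptions: the `PAGE_EXTENSIONS` set is a guess to adjust per site, and `is_page_view` is a made-up name:

```python
import os
from urllib.parse import urlparse

# Assumed list of extensions that count as pages; adjust for your site.
PAGE_EXTENSIONS = {"", ".htm", ".html", ".php", ".asp"}


def is_page_view(path, status):
    """True if a request looks like a successful page view, not an asset."""
    if status != 200:
        return False
    # Look only at the path part, so query strings cannot confuse the check.
    ext = os.path.splitext(urlparse(path).path)[1].lower()
    return ext in PAGE_EXTENSIONS
```

Note that a filter like this also drops tracking images, so requests for those would need to be counted separately before filtering.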

Hilary

11:54 am on Feb 12, 2003 (gmt 0)

I'm starting to get really embarrassed about the way I keep harping on about Summary. I'm not, honestly, connected with them except as a customer. Honestly.

Anyway, the web analytics tutorial that comes with their software is also available on the site: search Google for 'web analytics tutorial'. Of course it's based on the capabilities of Summary, but it still covers what you can look for in log files:
- how many visitors, and when
- where visitors come from
- what search engines and search phrases are sending the best visitors (ie the ones who subscribe, spend money etc)
- what visitors do at your site, for instance...
- where they enter and leave from, where they spend longest, which links they follow...
- what browsers they use - etc, etc

And suggests what you might *do* with all this information, which I found very useful.

Also there's a section on how accurate it all is!
