
Forum Moderators: DixonJones & mademetop


Do you use a homegrown log analyzer?

Or do you use a fancy product?



10:41 pm on Nov 25, 2002 (gmt 0)

10+ Year Member

I wrote my own log analyzer because I can't get what I want using the one that comes with my shared hosting account (WebTrends). And I didn't want to pay for a better one. I figure that's got to be the case for a lot of the members here. I see that a lot of you are looking at raw records from your logs. And you look for specific things like favicon.ico. So do you code your own? Or do some of the log analysis products give you that kind of detail? Don't you have to pay a lot for such a fancy log analyzer? Or are there cheap ones that give you what you want?

Here's an example: MSN has an "origq" term in its query string. It contains what the visitor searched for before the search that led to your site. In other words, a referer from MSN has not only what your visitor searched for, but what he searched for before that. I get all kinds of useful info out of this, like the words people commonly misspell. I might see that a few visitors have changed a search word from picture to photo. Then I know that I should have the word picture in that page. Is there a log analysis product that can do that?
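For anyone who wants to try the same thing, here is a minimal sketch of pulling both searches out of a referer. Only "origq" comes from what I described above. The "q" parameter name for the current search, the function name, and the example URL are assumptions made up for illustration, so check them against your own logs.

```python
from urllib.parse import urlsplit, parse_qs


def msn_search_terms(referer):
    """Return (current_search, previous_search) from a referer URL.

    "origq" is the previous-search parameter described in the post.
    "q" as the current-search parameter is an assumption.
    Either value is None when the parameter is missing.
    """
    params = parse_qs(urlsplit(referer).query)
    current = params.get("q", [None])[0]
    previous = params.get("origq", [None])[0]
    return current, previous


# Hypothetical referer, invented for this example.
example = "http://search.msn.com/results.asp?q=beach+photo&origq=beach+picture"
print(msn_search_terms(example))
```

Comparing the two values visitor by visitor is what shows a change like picture to photo.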

Here's another example: Of course, Google is not about to log in to my site like my visitors have to. But I still want Google to index my pages. So I have special code to let Google through. After picking through my logs, I figured out that people were reading my stories for free by using Google's cache. I can see this because sometimes the referer to my images comes from Google instead of one of my own pages. If I follow the referer, I see the cached version.
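The check itself is simple enough to sketch. This is not my actual code, just an illustration of the idea: flag any image request whose referer points at a host other than your own. The function name, the extension list, and the host names are all hypothetical.

```python
from urllib.parse import urlsplit

# Assumed list of image extensions; adjust to what your site serves.
IMAGE_EXTENSIONS = (".jpg", ".gif", ".png")


def is_offsite_image_hit(path, referer, own_host):
    """True when an image was requested from a page on another host."""
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return False
    # Logs commonly record a missing referer as "-".
    if not referer or referer == "-":
        return False
    host = urlsplit(referer).hostname or ""
    return host.lower() != own_host.lower()


# Hypothetical log values, invented for this example.
print(is_offsite_image_hit(
    "/img/story1.jpg",
    "http://www.google.com/search?q=cache:example",
    "www.example.com",
))
```

Following the flagged referers by hand is still how you confirm that the page is a cached copy.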


7:21 am on Nov 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

I use M$ Access. It is the only software I found that is able to cope with the size of a log file. Then I use different queries to see the relevant entries (like user-agent = Googlebot).

As for your problem with the google cache, try using this tag:



7:27 am on Nov 26, 2002 (gmt 0)

I use WebTrends...


9:41 am on Nov 26, 2002 (gmt 0)

WebmasterWorld Senior Member chiyo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Yep.. I use a simple spreadsheet at times to get a feel for traffic. Excel, for example. Sorting and filtering can get you the info that the dedicated log analysers don't handle so well. Never thought of using a database like Access, but that's a good idea. After deleting all the image hits etc., Excel can still handle thousands of records at a time. So it's great for small sites like ours with, say, fewer than 10,000 "real" page views a day.

After all, log analysers are just glorified spreadsheets.


9:46 am on Nov 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

chiyo, how do you delete image hits and the like?


2:53 pm on Nov 26, 2002 (gmt 0)

10+ Year Member

(original poster) Don't you think there's a market out there for a product that gives you lots of control but can only handle a few thousand hits? Some people don't like messing with Access or Excel. Then again, a product that puts it all into Access or Excel the right way might be nice. I would think people have trouble getting the W3C Extended Log File format into Access. Mine does that but I also added a nice little report designer.

(Disclosure: I made a product out of mine but I know I'm not supposed to sell here. So I won't say what it is. I promise to be good. I think I'm going to like it here.)


3:01 pm on Nov 26, 2002 (gmt 0)

WebmasterWorld Senior Member chiyo is a WebmasterWorld Top Contributor of All Time 10+ Year Member


just run a filter for lines including the string ".jpg" or ".gif" or ".ico" or ".js" or whatever other extensions are superfluous to your analysis and then delete or exclude them. Will reduce your file size by about 70% to 95% depending on how many images/scripts you have per page.
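If you would rather script it than do it by hand, the same filter can be sketched in a few lines. The function names and sample lines are made up for illustration, and the plain substring match is deliberately as crude as the manual version.

```python
# Extensions named in the post; extend as needed.
SKIP_EXTENSIONS = (".jpg", ".gif", ".ico", ".js")


def keep_line(line):
    """True when the log line mentions none of the skipped extensions.

    This is a plain substring test, so ".js" also drops ".jsp" requests
    and a referer that contains ".gif" drops the whole line.
    """
    lowered = line.lower()
    return not any(ext in lowered for ext in SKIP_EXTENSIONS)


def filter_log(lines):
    """Return only the lines worth importing into the spreadsheet."""
    return [line for line in lines if keep_line(line)]


# Hypothetical log lines, invented for this example.
sample = [
    '192.0.2.1 - - [26/Nov/2002:15:01:00 +0000] "GET /index.html HTTP/1.0" 200 5120',
    '192.0.2.1 - - [26/Nov/2002:15:01:01 +0000] "GET /logo.gif HTTP/1.0" 200 900',
    '192.0.2.1 - - [26/Nov/2002:15:01:01 +0000] "GET /favicon.ico HTTP/1.0" 404 0',
]
for line in filter_log(sample):
    print(line)
```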


3:06 pm on Nov 26, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member


No, what I meant is with what method/software do you run that filter? The reason I use Access and not Excel (which would be more practical) is that the file is too big prior to filtering.


3:37 pm on Nov 26, 2002 (gmt 0)

WebmasterWorld Senior Member chiyo is a WebmasterWorld Top Contributor of All Time 10+ Year Member

Ah sorry, I preprocess using a text editor with a macro if the file is too large to import into a spreadsheet.


2:47 pm on Nov 27, 2002 (gmt 0)

10+ Year Member

I take the raw log and:
  1. Use the freeware program LookUpIp to change the addresses (e.g. 999.999.999.999 -> www.somesite.com)
  2. Use GREP to remove .jpg .gif .js and robots
  3. Use SED to "repair" Danish and German letters (e.g. %F8 -> ø)
  4. Look at the log with the freeware application Loggling (remember to delete the lines added by GREP)
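
Step 3 can also be done without a list of SED rules. This is a rough sketch, assuming the escapes are ISO-8859-1 (Latin-1) bytes, which is what %F8 for a Danish letter suggests. The function name is made up for illustration.

```python
from urllib.parse import unquote


def repair_escapes(text):
    """Decode %XX escapes as Latin-1 characters (assumed encoding)."""
    return unquote(text, encoding="latin-1")


# Hypothetical search words, invented for this example.
print(repair_escapes("s%F8gning"))
print(repair_escapes("%FCbersicht"))
```

Note that this decodes every escape, including %20 to a space, so it is safer to apply it to the extracted search phrase than to a whole log line that is later split on spaces.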

To get an overview of the search words and search terms, I use the freeware program Analog.


5:21 pm on Nov 27, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

2buck -

I'm really surprised there aren't more apps out there. I've been looking for programs for a while and have not found anything that great. The homemade scripts work well for me with Access or SQL, as they tend to just log the visits/uniques, which is all I really want, and I can run SQL queries to find out more info. The problem is the size of the data, and the referrals take some reading as the phrases are just stuck in with all the other stuff.

Personally I don't find WebTrends very useful at all. It counts every single server request and can therefore be quite misleading about how busy the site really is.

To handle bigger files, I have started to write a program in C++ to load referer text files into. One of the hardest parts is to work out the rules for parsing each of the main search engines' query strings. Do you have some info on this?
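
The shape I have in mind for those rules is a small table from search engine to query parameter, sketched here in Python rather than C++ to keep it short. The parameter names below are assumptions from memory, not a verified list, and the host matching is a simple substring test, so treat every entry as something to check against real referers.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed host marker -> search parameter. Verify each against your logs.
ENGINE_PARAMS = {
    "google.": "q",
    "yahoo.": "p",
    "msn.": "q",
}


def search_phrase(referer):
    """Return the search phrase from a known engine's referer, else None."""
    parts = urlsplit(referer)
    host = (parts.hostname or "").lower()
    for marker, param in ENGINE_PARAMS.items():
        if marker in host:
            values = parse_qs(parts.query).get(param)
            return values[0] if values else None
    return None


# Hypothetical referers, invented for this example.
print(search_phrase("http://www.google.com/search?q=log+analyzer"))
print(search_phrase("http://search.yahoo.com/search?p=web+stats"))
```

Adding an engine is then one more line in the table rather than a new branch of parsing code.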

The 'origq' is something to add to the list now :) thanks

I have a ton of data going back 3 yrs, and hopefully when it's done it will give me a concordance of referrals that I can process and that will show me really neat stuff like

- Frequency of individual words, phrases
- Difference between search engines
- maybe even help me to guess which words to target :)
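
The first item on that list can be sketched as follows, assuming the search phrases have already been pulled out of the referers. The function name and the sample phrases are hypothetical.

```python
from collections import Counter


def word_and_phrase_counts(phrases):
    """Count individual words and whole phrases, ignoring case and extra spaces."""
    words = Counter()
    full = Counter()
    for phrase in phrases:
        normalized = " ".join(phrase.lower().split())
        if not normalized:
            continue  # skip empty searches
        full[normalized] += 1
        words.update(normalized.split())
    return words, full


# Hypothetical phrases, invented for this example.
sample = ["Log Analyzer", "log  analyzer", "free log tool"]
words, full = word_and_phrase_counts(sample)
print(words.most_common(3))
print(full.most_common(1))
```

Running it once per search engine and comparing the counters would cover the second item as well.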


5:52 pm on Nov 27, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

I found a compilation of log analyzers [uu.se] by category. I'm most interested in the referral log analyzers, as there are a lot of ways to see the data that I had not considered before. I like this category a lot: External referring URLs to this site, and their local targets [ktmatu.com].

You can get an idea of which keywords, intended or not, are effective for a specific page.


7:01 pm on Nov 28, 2002 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member

I log all my stuff into a database. Besides setting a cookie for sessions (30 minutes), I also set another cookie for a visitor number, for 1 year.

So I get something like:
visitor session ip request agent referer time

After that, queries, views and stored procedures give me more info than all the analyzing packages in the world.
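
As a rough illustration of that kind of query, here is a sketch using SQLite in memory. The table name, column types, and sample rows are assumptions. Only the column list comes from the layout above, with "time" renamed to "hit_time" to keep the column name unambiguous.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table shaped like the layout above, with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hits (visitor INTEGER, session INTEGER, ip TEXT, "
        "request TEXT, agent TEXT, referer TEXT, hit_time TEXT)"
    )
    rows = [  # hypothetical data, invented for this example
        (1, 10, "192.0.2.1", "/", "Mozilla", "-", "2002-11-27 10:00:00"),
        (1, 10, "192.0.2.1", "/story.html", "Mozilla", "-", "2002-11-27 10:05:00"),
        (1, 11, "192.0.2.1", "/", "Mozilla", "-", "2002-11-27 15:00:00"),
        (2, 12, "192.0.2.2", "/", "Mozilla", "-", "2002-11-28 09:00:00"),
    ]
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    return conn


def daily_summary(conn):
    """Return (day, visits, unique visitors, page views) per day."""
    return conn.execute(
        "SELECT substr(hit_time, 1, 10) AS day, COUNT(DISTINCT session), "
        "COUNT(DISTINCT visitor), COUNT(*) "
        "FROM hits GROUP BY day ORDER BY day"
    ).fetchall()


for row in daily_summary(build_demo_db()):
    print(row)
```

Because visits and visitors are stored per request, the same table answers questions about entry pages, referers, or returning visitors with nothing more than another query.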

