Merge multiple files for stats...

How to!

         

Angelis

8:00 am on Aug 23, 2005 (gmt 0)

10+ Year Member



Hi, I have server logs from a third-party server, stored in .gz format.

I am using AWStats on a local machine to analyze the logs. The problem is that AWStats reads a single file called access_log, which has all the basic information in it.

What I need to do is merge the contents of the .gz files into the access_log file so the stats server can recognise it.

Does anyone know the shell command to do this? I have no idea...

Thanks in advance.

wheel

2:18 pm on Aug 23, 2005 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In the absence of other replies :), I'd start with something like:

cat logfile.txt >> otherfilename.txt

(Don't run that command as-is, because I doubt it works. What you need to look into is the cat command, which prints the contents of a file, combined with either > to write that output to a new file or >> to append it onto the end of an existing one. Some combination of commands like that should get you there.)
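A minimal sketch of the difference between the two redirections, run in a scratch directory. The file names and contents are made up for illustration:

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two small stand-in log files (hypothetical names and contents).
printf 'line 1\nline 2\n' > part1.log
printf 'line 3\n' > part2.log

# '>' creates merged.log, or truncates it if it already exists.
cat part1.log > merged.log

# '>>' appends to the end of merged.log instead of replacing it.
cat part2.log >> merged.log

cat merged.log
```

Running the first cat line a second time would reset merged.log to the contents of part1.log alone, which is why >> is the safer choice when adding to a file that already holds data.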

Angelis

3:25 pm on Aug 23, 2005 (gmt 0)

10+ Year Member



It does actually work; I found that out just before you posted your message.

You have to make a blank file:

pico filename.ext

Save it and close it then type:

cat file1.ext file2.ext > newfile.txt

Or you can use * instead to do all files in the directory.

Problem solved.... Thanks :)
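Since the original logs were .gz files, here is a rough sketch of how the same merge might look when the inputs are still compressed. It assumes gzip and gunzip are installed, and the file names are hypothetical stand-ins:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-ins for the compressed logs (hypothetical names and contents).
printf 'GET /a\n' > access_log.1
printf 'GET /b\n' > access_log.2
gzip access_log.1 access_log.2

# gunzip -c writes the decompressed data to stdout and leaves the .gz files alone.
# The shell expands the wildcard in sorted order, so .1 comes before .2 here.
gunzip -c access_log.*.gz > access_log

cat access_log
```

The output name is chosen so that the wildcard does not match it, which keeps the merged file from being picked up as one of its own inputs.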

MattyMoose

4:10 pm on Aug 23, 2005 (gmt 0)

10+ Year Member



A quick addition: if you don't have pico installed, or don't want to use a text editor to create the file, you can:

touch filename.txt
# Creates a file with 0 bytes.
echo " " > filename.txt
# Creates a file containing a space (plus the newline that echo adds).

The reason I mention the second one is that I've found that some syslog implementations (*cough* Slowlaris *cough*) will not write to a file that has length 0, but will as soon as there's data in it. BTW, I don't like their syslog. :)

On top of all that, if you're doing the cat with >> (redirection), you don't need to do the above. It'll create the file for you.
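To see the three cases side by side, something like the following could be run in a scratch directory. The file names are hypothetical:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

touch empty.txt            # 0 bytes
echo " " > space.txt       # a space plus the newline echo adds: 2 bytes
echo "entry" >> fresh.log  # fresh.log did not exist; >> creates it

# Print the byte count of each file.
wc -c empty.txt space.txt fresh.log
```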

MM

Angelis

7:50 am on Aug 24, 2005 (gmt 0)

10+ Year Member



That was my initial problem: it wouldn't create the file.

I use joe as my editor. I don't even have pico installed; I just used it as an example :)

ChadSEO

11:15 pm on Aug 24, 2005 (gmt 0)

10+ Year Member



I just wanted to add that AWStats comes with a script called logresolvemerge.pl, located in the /tools/ directory. It will automatically combine multiple log files for you. In your awstats config file, for the LogFile, you can do things like this (the trailing pipe tells AWStats to read the command's output):

LogFile="/usr/local/awstats/tools/logresolvemerge.pl /var/log/httpd/example.com.* |"

Or my favorite (assuming you're on Linux), to automatically remove an IP address or addresses from the stats:

LogFile="/usr/local/awstats/tools/logresolvemerge.pl /var/log/httpd/example.com.* | grep -vE '(123.123.123.123)|(123.123.123.124)' |"

I use a lot of different configurations like this, to strip out certain IPs, only grab certain files, etc.
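To try the filtering step on its own before putting it in the config, a sketch like this can be run against a few made-up log lines. The sample data is invented, and the escaped dots and the ^ anchor are additions that are not in the post; they keep look-alike addresses from matching:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up log lines; the addresses are the placeholders used in the post.
cat > sample.log <<'EOF'
123.123.123.123 - - "GET /index.html"
10.1.2.3 - - "GET /about.html"
123.123.123.124 - - "GET /admin"
EOF

# -v inverts the match (keep lines that do NOT match).
# -E enables extended regex, which allows the (a|b) alternation.
grep -vE '^(123\.123\.123\.123|123\.123\.123\.124) ' sample.log > filtered.log

cat filtered.log
```

Only the 10.1.2.3 line should survive the filter.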

Chad

MattyMoose

4:07 pm on Aug 25, 2005 (gmt 0)

10+ Year Member



I just closed my tab by mistake, so I'm a little frustrated... :(

At any rate, I was writing about how we split up our Apache log files by day, month and year, so that we can audit our logfiles years later (financial transactions/investigations may require this!). The directory structure is a little funny due to older scripts and some automated tools which require some duplication, and the fact that we have multiple webservers. ;)

The directory structure looks like:

webserver01->01->03->webserver01-combined->webserver01.01.01.03.log
webserver02->01->03->webserver02-combined->webserver02.02.01.03.log
...
webserver01->2005->08->webserver01-combined->webserver01.24.08.05.log
webserver02->2005->08->webserver02-combined->webserver02.24.08.05.log

Here's the script that I wrote to merge all the past month's files together.

 /usr/local/www/awstats/tools/logresolvemerge.pl /webserver0[1-9]/08.05/webserver0[1-9]-combined/* > /http_summaries/monthly.log 

That will merge all the logfiles in:
/webserver01/08.05/webserver01-combined/*
/webserver02/08.05/webserver02-combined/*
...
/webserver09/08.05/webserver09-combined/*

into /http_summaries/monthly.log

You could even do it as a multi-month join by specifying a range for the date directory.
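As a rough illustration of the multi-month idea, the sketch below builds a throwaway directory tree and uses a bracket range in the date directory. The layout is a hypothetical simplification of the one above, and cat stands in for logresolvemerge.pl, so the output here is simply concatenated in wildcard order rather than merged by timestamp:

```shell
root=$(mktemp -d)

# Hypothetical layout mirroring the post: <server>/<MM.YY>/<server>-combined/
for server in webserver01 webserver02; do
  for month in 06.05 07.05 08.05; do
    dir="$root/$server/$month/$server-combined"
    mkdir -p "$dir"
    echo "$server $month" > "$dir/$server.log"
  done
done

# 0[7-8].05 matches the 07.05 and 08.05 directories but not 06.05.
cat "$root"/webserver0[1-9]/0[7-8].05/webserver0[1-9]-combined/* > "$root/two_months.log"

cat "$root/two_months.log"
```

With the real tool, the same wildcard pattern would be passed to logresolvemerge.pl in place of cat.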

Also, for the IP Address filtering, we filter out our IP ranges in Apache itself to go to an image/junk logfile.

This is accomplished with:


SetEnvIf REMOTE_ADDR ^(127\.0\.0\.1|192\.168\.2\.|192\.168\.3\.|10\.) image-local

CustomLog /www/log/ws01-images combined env=image-local
CustomLog /www/log/ws01-combined combined env=!image-local

Anyway, I know it's a little OT, but I thought I'd share that. :)

MM