Forum Moderators: DixonJones


First approach to my logfiles

Error in "Parsing logfile" script from PHP Cookbook...

         

tomda

6:16 am on Nov 8, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hi,

Yesterday I played around with my logfiles.

First, I tried Analog and was not really convinced by the statistics the software provided (lots of unknowns). Strangely, my logfiles have my domain name at the end of each line, which makes them unreadable in Analog, so they need extra work (unzipping, opening, replacing and saving)...

******************************************
So I would like to write my own log analyser in PHP (much better, isn't it?) and found in my great PHP Cookbook resource
a script to parse my logfile, but I got errors.
I would appreciate it if someone could have a look at the script below. I am getting this error message ("Undefined offset: 0 in "). Thank you.
BTW, I run PHP 4.3.3 and Apache 1.3.27.


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head>
<style>
table {font-size:10pt;}
</style>
<title>Parse logfile</title>
</head>

<body>
<?php
//*********
//LOCAL VAR
//*********
/*
SAVE THESE TWO LINES IN AN EXTERNAL FILE CALLED TEST_LOG_FILE.TXT AND RUN THE PHP SCRIPT
68.163.36.60 - - [23/Aug/2004:00:46:17 +0200] "GET /news.php HTTP/1.1" 200 12229 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
68.163.36.60 - - [23/Aug/2004:00:46:17 +0200] "GET /news.php HTTP/1.1" 200 12229 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
*/

//*******************
// CALL EXTERNAL FILE
//*******************
$fn = "test_log_file.txt";

// OPEN AND READ THE FILE
$fp = @fopen($fn, "r") or die("Can not open $fn");
while ($line=fgets($fp, 1024)) {

//TRIM AND ECHO LINE
$line = trim($line);
echo $line;

//***************
//SPLIT IN ARRAYS
//***************
//THIS IS AN ALTERNATIVE TO THE ORIGINAL SCRIPT
preg_match_all("/^(\S+)\s+(\S+)\s+(\S+)\s+\[(.*)\]\s+\"(.*)\"\s+(\S+)\s+(\S+)$/x", $line, $matches,PREG_SET_ORDER);
//***OR**
//THIS WAS THE ORIGINAL SCRIPT
preg_match("/^(\S+)\s+(\S+)\s+(\S+)\s+\[(.*)\]\s+\"(.*)\"\s+(\S+)\s+(\S+)$/x", $line, $matches);
//***********

echo $matches;

array_shift($matches);

$host=$matches[0];
echo $host;
$identify=$matches[1];
$user=$matches[2];
$time=$matches[3];
$url=$matches[4];
$success=$matches[5];
$bytes=$matches[6];

preg_match("@(..)/(...)/(....):(..):(..)@", $time, $matches);

$day=$matches[0];
$mon=$matches[1] + 1;
$year=$matches[2];
$hour=$matches[3];
$minutes=$matches[4];
$secondes=$matches[5];

Thank you
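
A likely cause of the "Undefined offset: 0" notice, judging from the two sample lines: they are in Apache's combined log format, which ends with a quoted referer and a quoted user agent, while the pattern in the script stops after the byte count and is anchored with `$`. The match then fails, `$matches` stays empty, and `$matches[0]` does not exist. The date pattern has a similar problem: index 0 of a `preg_match` result is the whole match, and the pattern has only five groups where six values are read. The sketch below checks the return value of `preg_match` before using the result. The function name `parse_combined_line` and the array keys are made up for illustration, and a trailing domain name on each line (as described in the post) is simply ignored because the pattern is not anchored at the end.

```php
<?php
// Hypothetical helper: parse one line in Apache "combined" log format.
// Returns an associative array, or null when the line does not match.
function parse_combined_line($line)
{
    $pattern = '/^(\S+)\s+(\S+)\s+(\S+)\s+\[([^\]]+)\]\s+"([^"]*)"\s+(\S+)\s+(\S+)'
             . '\s+"([^"]*)"\s+"([^"]*)"/';

    // preg_match returns 1 on a match and 0 otherwise: test it before
    // reading $m, otherwise $m[0] is undefined.
    if (preg_match($pattern, trim($line), $m) !== 1) {
        return null;
    }

    $entry = array(
        'host'     => $m[1],
        'identity' => $m[2],
        'user'     => $m[3],
        'time'     => $m[4],
        'request'  => $m[5],
        'status'   => $m[6],
        'bytes'    => $m[7],
        'referer'  => $m[8],
        'agent'    => $m[9],
    );

    // Index 0 is the whole match, so the date parts start at index 1.
    // The month is a name ("Aug"), not a number, so it is kept as text.
    if (preg_match('@^(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2})@', $entry['time'], $t)) {
        $entry['day']     = $t[1];
        $entry['month']   = $t[2];
        $entry['year']    = $t[3];
        $entry['hour']    = $t[4];
        $entry['minutes'] = $t[5];
        $entry['seconds'] = $t[6];
    }

    return $entry;
}

// Sample line taken from the post.
$sample = '68.163.36.60 - - [23/Aug/2004:00:46:17 +0200] "GET /news.php HTTP/1.1" '
        . '200 12229 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"';

$entry = parse_combined_line($sample);
if ($entry === null) {
    echo "Line skipped: unexpected format\n";
} else {
    echo $entry['host'], ' ', $entry['status'], ' ', $entry['bytes'], "\n";
}
```

Lines that do not match are reported instead of raising a notice. Note also that `fgets($fp, 1024)` cuts off lines longer than 1023 bytes, which can make an otherwise valid line fail to match.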

tomda

8:31 am on Nov 9, 2004 (gmt 0)




I finally found a script that can parse my logfile (see below). It works great with my logfile; I just need to update it a bit to handle errors.

Nonetheless, I need more information about the retrieved data and which data I should use or ignore.

1/ What is the 5-digit number after the Apache error/success code (200, 404, 304)? Is it the size in bytes?

2/ I am planning to put this data in a database in order to run queries by IP, date or bot.

I will only add to the db the URLs that contain "php" or "html" (and therefore ignore all pictures jpeg/gif, css and js files loaded). Problem: I would not be able to see hotlinking. What do you think?

Also, should I put the HTTP value and the GET/POST value in my database? Are they useful?
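
One way to keep only pages and still notice hotlinking is to classify each request before storing it: pages are kept, image requests are dropped unless their referer points to another host. This is only a sketch; the function name `classify_request`, the extension lists and the host `www.example.com` are assumptions for illustration, not values from the log.

```php
<?php
// Hypothetical classifier: decide what to do with one request.
// $path    : requested URL, e.g. "/news.php?id=3"
// $referer : referer field from the log, "-" when absent
// $ownHost : the site's own host name (assumed value in the calls below)
function classify_request($path, $referer, $ownHost)
{
    // Strip the query string before looking at the extension.
    $parts = explode('?', $path, 2);
    $ext   = strtolower(pathinfo($parts[0], PATHINFO_EXTENSION));

    if (in_array($ext, array('php', 'html', 'htm'), true)) {
        return 'page';
    }

    if (in_array($ext, array('jpg', 'jpeg', 'gif', 'png'), true)) {
        // parse_url gives no host for "-" or for an empty referer.
        $refHost = parse_url($referer, PHP_URL_HOST);
        // A referer on a foreign host suggests the image is embedded elsewhere.
        if (is_string($refHost) && strcasecmp($refHost, $ownHost) !== 0) {
            return 'hotlink';
        }
    }

    return 'ignore';
}

echo classify_request('/news.php?id=3', '-', 'www.example.com'), "\n";
echo classify_request('/img/logo.gif', 'http://www.example.com/news.php', 'www.example.com'), "\n";
echo classify_request('/img/logo.gif', 'http://other.example.net/page.html', 'www.example.com'), "\n";
```

The three calls print `page`, `ignore` and `hotlink`. The host comparison is exact apart from letter case, so a site reachable both with and without `www.` would need both names checked.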

Thank you for helping me out.

Here is the script for those who may be interested
***************


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<html>
<head>
<style>
table {font-size:10pt;}
</style>

<title>Untitled</title>
</head>
<body>

<?php

echo "<table border=1><tr align=center><td>IP/HOST</td><td>IDENTITY</td><td>USER</td><td>DATE</td><td>?</td><td>PAGE VIEW</td><td>?</td><td>?</td><td>ERROR<br>SUCCESS</td><td>BYTES</td><td>COME FROM...</td><td>BOTS</td></tr>";

//*********
//LOCAL VAR
//*********

/*
SAVE THESE TWO LINES IN AN EXTERNAL FILE CALLED TEST_LOG_FILE.TXT AND RUN THE SCRIPT
68.163.36.60 - - [23/Aug/2004:00:46:17 +0200] "GET /news.php HTTP/1.1" 200 12229 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
68.163.36.60 - - [23/Aug/2004:00:46:17 +0200] "GET /news.php HTTP/1.1" 200 12229 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"
*/

//*******************
// CALL EXTERNAL FILE
//*******************
$fn = "test_log_file.txt";
// OPEN AND READ THE FILE
$fp = @fopen($fn, "r") or die("Can not open $fn");
while ($line=fgets($fp,1024)) {
//TRIM AND ECHO LINE
$line = trim($line);

if (preg_match('!^([^ ]+) ([^ ]+) ([^ ]+) \[([^\]]+)\] "([^ ]+) ([^ ]+) ([^/]+)/([^"]+)" ([^ ]+) ([^ ]+) ([^ ]+) (.+)!',
$line,
$elements))
{

if (preg_match('/php/',$elements[6]) || preg_match('/html/',$elements[6])) {
echo "<tr align=center><td>".$elements[1]."</td><td>".$elements[2]."</td><td>".$elements[3]."</td><td>".$elements[4]."</td><td>".$elements[5]."</td><td>".$elements[6]."</td><td>".$elements[7]."</td><td>".$elements[8]."</td><td>".$elements[9]."</td><td>".$elements[10]."</td><td>".$elements[11]."</td><td>".$elements[12]."</td></tr>";}
// print_r($elements);
} }
echo "</table>";
?>
</body></html>
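
For reference, the twelve groups captured by the pattern in the script correspond to the fields of Apache's combined log format, with the request line split into method, path, protocol name and protocol version. The number after the status code is the size of the response in bytes, not counting headers; Apache writes "-" there when no body was sent. The sketch below reuses the script's pattern and only attaches names to the groups. The key names and the function `label_elements` are illustrative, not part of the original script.

```php
<?php
// Pattern copied from the script above; everything else is illustrative.
$pattern = '!^([^ ]+) ([^ ]+) ([^ ]+) \[([^\]]+)\] "([^ ]+) ([^ ]+) ([^/]+)/([^"]+)" '
         . '([^ ]+) ([^ ]+) ([^ ]+) (.+)!';

// Hypothetical labels for groups 1 to 12, in capture order.
$names = array(
    1 => 'host',   2  => 'identity', 3  => 'user',     4  => 'time',
    5 => 'method', 6  => 'path',     7  => 'protocol', 8  => 'version',
    9 => 'status', 10 => 'bytes',    11 => 'referer',  12 => 'agent',
);

function label_elements($pattern, $names, $line)
{
    if (!preg_match($pattern, trim($line), $elements)) {
        return null; // line is not in the expected format
    }

    $row = array();
    foreach ($names as $index => $name) {
        $row[$name] = $elements[$index];
    }

    // Groups 11 and 12 are captured with their surrounding quotes.
    $row['referer'] = trim($row['referer'], '"');
    $row['agent']   = trim($row['agent'], '"');

    // "-" in the size field means no response body: store it as 0.
    $row['bytes'] = ($row['bytes'] === '-') ? 0 : (int) $row['bytes'];

    return $row;
}

// Sample line taken from the post.
$line = '68.163.36.60 - - [23/Aug/2004:00:46:17 +0200] "GET /news.php HTTP/1.1" '
      . '200 12229 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"';

$row = label_elements($pattern, $names, $line);
if ($row !== null) {
    echo $row['method'], ' ', $row['path'], ' ', $row['status'], ' ', $row['bytes'], "\n";
}
```

For the sample line this prints `GET /news.php 200 12229`. Because group 11 stops at the first space, a referer that contains a space would shift the last two fields.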