Breaking active log files into chunks


ocon

10:36 pm on Aug 30, 2011 (gmt 0)

10+ Year Member Top Contributors Of The Month



I have a text file that logs system information. What I'm trying to do is create a script that runs periodically to read, parse, and remove this information, then submit it to an API for statistical analysis.

The API limits me to 500 records per batch. I'm also concerned about the log being written to while I'm simultaneously trying to process the file.

I've created the script below, which works, but it seems slow and clunky, and I'm wondering if anyone can suggest how to improve it.

$file = "myfile.txt"; // Location of my file
$data = explode("\n", file_get_contents($file)); // Read the file contents
unlink($file); // Immediately delete the file so new entries are kept separate from the entries this script holds in memory
unset($data[count($data) - 1]); // Remove the trailing blank line

$hopper = count($data); // Records left to process
$offset = 0;            // Index of the next unprocessed record

for ($group = ceil($hopper / 500); $group > 0; $group--) {

    $insert = "";
    $lineCount = ($hopper > 500) ? 500 : $hopper; // At most 500 records per batch

    for ($lines = 0; $lines < $lineCount; $lines++) {
        $part = explode(",", $data[$offset + $lines]); // Split one comma-separated record into fields
        $insert .= "... {$part[0]} ... {$part[1]} ..."; // Build a formatted version to send to the API
        $hopper--;
    }
    $offset += $lineCount; // Advance past this batch

    ... // Sends this batch to the API
}
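
For comparison, the same batching can be sketched more compactly with array_chunk; this is a rough alternative, not the poster's code, and the API call is still a placeholder:

$file = "myfile.txt";
$data = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES); // One record per element, trailing blank line dropped
unlink($file); // Same trick: new entries go to a fresh file

foreach (array_chunk($data, 500) as $batch) { // At most 500 records per batch
    $insert = "";
    foreach ($batch as $record) {
        $part = explode(",", $record);
        $insert .= "... {$part[0]} ... {$part[1]} ...";
    }
    // ... send this batch to the API
}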

ocon

11:34 pm on Aug 30, 2011 (gmt 0)

10+ Year Member Top Contributors Of The Month



I can control how my logs are written, and I was thinking about writing them in the formatted version with a consistent record length and no line breaks.

That would allow me to read the first x bytes of data and send them to the API, saving the rest for the next run, with very little processing. If the log starts getting pretty big, I can set the cron job to run more frequently.
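
For instance, the writer side could pad each entry to the fixed length as it's logged; a minimal sketch, where $record is a hypothetical pre-formatted entry and 123 bytes matches the placeholder length used below:

// Hypothetical writer: pad every record to a fixed 123 bytes, no line breaks
$fh = fopen("myfile.txt", "a");
fwrite($fh, str_pad($record, 123));
fclose($fh);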

I like this idea, but the thing I'm most uncomfortable with is reading what may be a massive file into a string, cutting it up, and then writing most of it back to the file. Is there a better way to do this?

$file = "myfile.txt";
$data = file_get_contents($file); // Read the whole log into memory

$entry_length = 123; // Fixed byte length of each record
$process = 500;      // Records per API batch
$bytes_to_process = $entry_length * $process;

// Truncate the log and write back everything beyond the first batch
$fh = fopen($file, "w");
fwrite($fh, substr($data, $bytes_to_process));
fclose($fh);

$insert = substr($data, 0, $bytes_to_process); // The first 500 fixed-length records
... send $insert string to the API
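
One way to avoid holding the whole file in memory, assuming the logger reopens the file for each write (which the unlink() approach above already relies on): rename the live log first so new entries go to a fresh myfile.txt, then stream the renamed copy one fixed-size batch at a time. A sketch using the same placeholder sizes; the .processing suffix is made up:

$file = "myfile.txt";
$work = $file . ".processing"; // Hypothetical working-copy name
rename($file, $work); // New log entries now go to a fresh myfile.txt

$entry_length = 123;
$process = 500;
$bytes_to_process = $entry_length * $process;

$fh = fopen($work, "r");
while (($insert = fread($fh, $bytes_to_process)) !== false && $insert !== "") {
    // ... send this $insert batch to the API
}
fclose($fh);
unlink($work); // All batches sent; discard the working copy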

brotherhood of LAN

11:45 pm on Aug 30, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you're on a UNIX system, split [unixhelp.ed.ac.uk] may be ideal.

I use it to split large DB inserts (millions of rows) into smaller chunks (thousands of rows).
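
For example, to break a file into 500-line pieces (the line count and the chunk_ output prefix here are just illustrative):

split -l 500 myfile.txt chunk_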