Regular expressions / Apache logs

Driving me up the wall!

         

sugarkane

7:58 am on Apr 2, 2001 (gmt 0)


I'm trying to parse Apache logfiles, and having trouble splitting the user agent into a variable. The log is in the standard 'combined' format, and this is the code I'm using:

open (FP,"/root/logfile");
while(<FP>) {
($host,$rfc931,$user,$date,$request,$URL,$status,$bytes,$refer,$ua) = /^(\S+) (\S+) (\S+) \[([^]]+)\] "(\w+) (\S+).*" (\d+) (\S+) "(\S+)" ".*"/;
}
close(FP);

I can match every other field in the log, but $ua is always empty.

Any ideas?

Brett_Tabke

12:50 pm on Apr 2, 2001 (gmt 0)


<- no regex expert.

Which means I usually do it in steps. Normally, I'm not going to use every field, so breaking it down a few at a time isn't a big problem.
($line is the log line)...

($Domain,$rfc931,$authuser,$TimeDate,$Request,$Status,$Bytes,$Referrer,$Agent) = $line =~ /^(\S+) (\S+) (\S+) \[([^\]\[]+)\] \"([^"]*)\" (\S+) (\S+) \"?([^"]*)\"? \"([^"]*)\"/o;
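As a rough check of the pattern above, here is a minimal sketch that runs it against one log line. The sample line is invented for illustration and not taken from anyone's real logfile; the /o is left off since the pattern contains no variables. The original attempt came up empty for $ua because its final ".*" has no parentheses around it, so there are only nine capture groups for ten variables. The split into $method, $URL and $protocol is an assumption about what you might want next, done as a separate step.

```perl
use strict;
use warnings;

# Hypothetical sample line in Apache 'combined' format (made up for this demo).
my $line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
         . '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
         . '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"';

# Same pattern as above, minus /o. Each quoted field is captured as
# "everything up to the next double quote", so spaces inside the
# user agent are not a problem.
my ($Domain, $rfc931, $authuser, $TimeDate, $Request,
    $Status, $Bytes, $Referrer, $Agent) =
    $line =~ /^(\S+) (\S+) (\S+) \[([^\]\[]+)\] "([^"]*)" (\S+) (\S+) "?([^"]*)"? "([^"]*)"/;

die "line did not match\n" unless defined $Domain;

# Second step: break the request field into its parts.
my ($method, $URL, $protocol) = split ' ', $Request;

print "agent:  $Agent\n";
print "method: $method, URL: $URL, status: $Status\n";
```

Inside the while(<FP>) loop you would match against $_ (or copy it into $line first) rather than using a hard-coded string.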

sugarkane

2:25 pm on Apr 2, 2001 (gmt 0)


Thanks Brett, that works fine. Now, if I can only decipher how it works... ;)

How does the 'o' modifier at the end affect things?

Brett_Tabke

8:05 pm on Apr 2, 2001 (gmt 0)


The o tells Perl to compile the regex only once, instead of re-interpolating and recompiling it every time the match runs. That can really speed things up in a loop, but in this case it is a leftover from older code, as /o only has an effect if the regex contains a variable.

See Perl FAQ 6 "What is /o really for?"
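
To see what /o does when the regex does contain a variable, here is a small sketch. The names $word, @hits_with_o and @hits_without_o are made up for the demo. It runs the same match twice with a changing variable, once with /o and once without.

```perl
use strict;
use warnings;

my @hits_with_o;
my @hits_without_o;

for my $word (qw(foo bar)) {
    # With /o, $word is interpolated only the first time this match runs,
    # so the pattern stays /^foo/ even after $word has become 'bar'.
    push @hits_with_o, ('foobar' =~ /^$word/o ? 1 : 0);

    # Without /o, the pattern follows the current value of $word.
    push @hits_without_o, ('foobar' =~ /^$word/ ? 1 : 0);
}

print "with /o:    @hits_with_o\n";     # with /o:    1 1
print "without /o: @hits_without_o\n";  # without /o: 1 0
```

The second /o match still succeeds even though 'foobar' does not start with 'bar', which is the trap: /o is only safe when the variable never changes after the first match.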