Forum Library, Charter, Moderators: incrediBILL & lawman

Foo Forum

    
Watching webserver logs
where does the time go?
bonanza




msg:280389
 6:53 pm on May 7, 2003 (gmt 0)

Does anybody else here do a 'tail -f access_log' on their webserver and watch the hits stream by like the computer screen on the Matrix?

Or is it just me? :)

 

ritualcoffee




msg:280390
 6:54 pm on May 7, 2003 (gmt 0)

that's a great way to get high blood pressure. ;)

ncw164x




msg:280391
 8:20 pm on May 7, 2003 (gmt 0)

Yes bonanza, I do it to check what's hitting the server and to find out what new UAs or spiders are out there. You start by only doing 10 minutes and next thing you know an hour has passed by.

jpjones




msg:280392
 9:32 pm on May 13, 2003 (gmt 0)

My name is jpjones and I'm a log-watchaholic.

I'm afraid I'm frequently guilty of this. I have a dual-monitor setup at work, and one screen frequently has several windows showing the live logs of various webservers.

Interesting to watch, and I sometimes go as far as to filter it to show only web page requests, and then follow visitors' progress live through the website. Useful to watch when you've added a special offer to the homepage, and just for gauging surfers' habits.
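For anyone wanting to try that kind of filtering, here is a minimal sketch. It assumes the log is in Apache's Common or Combined Log Format, so that whitespace-separated field 1 is the client IP and field 7 is the requested path. The function names, the extension list, and the example IP are illustrative, not anything from the post above.

```shell
# Assumption: Common/Combined Log Format, where field 1 is the
# client IP and field 7 is the requested path.

# pages_only: drop requests for images, stylesheets and scripts,
# whether or not the path carries a query string.
pages_only() {
    awk 'tolower($7) !~ /\.(gif|jpe?g|png|css|js)(\?.*)?$/ { print; fflush() }'
}

# visitor: keep only the lines from one client IP (first argument).
visitor() {
    awk -v ip="$1" '$1 == ip { print; fflush() }'
}

# Usage (hypothetical log name and IP):
#   tail -f access_log | pages_only
#   tail -f access_log | pages_only | visitor 192.0.2.10
```

The fflush() calls are there so lines show up as they arrive rather than when awk's output buffer fills, which matters when one filter is piped into another.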

jcoronella




msg:280393
 12:14 am on May 14, 2003 (gmt 0)

tail -f access_log | grep googlebot

mischief




msg:280394
 4:12 am on May 14, 2003 (gmt 0)

Hmm... I don't know about being a log-watchaholic, but I guess I spend a fair amount of time logwatching. First I started off with a simple tail -f, but there's too much uninteresting (therefore annoying) stuff flowing by, so I changed it to tail -f logfile | grep -v boring | grep -v uninteresting. But that kind of sucks, so I started using egrep: tail -f logfile | egrep -v '(boring|uninteresting|nono|yawn)'. That was quite nice for a while, until I found myself with multiple accounts that logs were stored on. I'd make a small change to filter something new on one box, then expect it to be on another box, find it wasn't, not remember which machine it was on so redo it... very inefficient. So I made it an alias in my dotfiles that used an environment variable as the regex to filter out - that way I can keep the same alias everywhere, and just copy the regex to my bash_profiles. The problem was that the log files were in different places on different boxes, so I ended up making another var for the location of log files, which was specific to each host. Then of course there's some bits from each log file that are different between servers, so the obvious solution was to split $LOG_IGNORE_REGEX into $LOG_IGNORE_REGEX_COMMON and $LOG_IGNORE_REGEX_LOCAL (not the actual names, but similar). This was also nice for a while, but add just one more web server and it started getting difficult keeping things synchronised. No problem really though - a couple of little bash functions to distribute dotfiles between servers would solve it, of course with a few more environment variables to make sure each machine distributes to the right places and not itself. I think I'm sorted now though.
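Roughly, the dotfile setup described above might look like the sketch below. The variable names follow the "similar" names given in the post, while LOG_FILE, the two function names, the default log path, and the regex contents are placeholders assumed for illustration.

```shell
# Shared across every box (copied around with the dotfiles).
# The regex contents are placeholders.
LOG_IGNORE_REGEX_COMMON='boring|uninteresting'

# Specific to this host (hypothetical values).
LOG_IGNORE_REGEX_LOCAL='nono|yawn'
LOG_FILE=${LOG_FILE:-/var/log/httpd/access_log}

# filter_log: drop every line matching either ignore regex.
# grep -E is the modern spelling of egrep.
filter_log() {
    grep -E -v "(${LOG_IGNORE_REGEX_COMMON}|${LOG_IGNORE_REGEX_LOCAL})"
}

# watchlog: the same command on every host; only the variables differ.
watchlog() {
    tail -f "$LOG_FILE" | filter_log
}
```

Keeping the filter in its own function means it can be tried against an old log file (filter_log < some_old_log) before trusting it on the live stream.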

Of course, playing with Apache directives like LogFormat, CustomLog, and SetEnvIf makes things more interesting. It's really easy to get Apache to filter out some of the stuff you don't really want to see anyway. There always seems to be some time when you think "tch, I wish I had the old log file right now", though. The simplest solution is just to have Apache log to both files, so you can pick which one is more appropriate. 'Course I do find it a pain having to su, edit the log file, and restart every time I want to make a small change though... well, I used to, but I figured it would make it simpler to just set up Apache to pipe logs to a shell script that filters stuff out. No need for root then. Just a couple more entries in httpd.conf, maybe another environment variable somewhere so I can get to the right log file on each server, and the script itself. But that's it! I'm definitely satisfied now. Although this script is getting a bit long, things keep popping up that I want to filter out, so maybe it would be better to rewrite it in Perl. Silly to use shell in the first place really, Perl was the obvious choice anyway. Shouldn't take long to convert it... But maybe I should get that JavaScript working in IE properly first, it must be something simple. All I want to do is have the browser scroll to the bottom of the screen automatically, so my PHP script can keep feeding it lines from the log file it's reading. Works in Opera fine! I wouldn't bother with IE ordinarily, but funny thing is a guy I work with really loves watching those lines go by and I could get a couple of beers out of it I reckon. Odd, he's almost, I dunno, like he's got log files on the brain or something!

Clark




msg:280395
 4:23 am on May 14, 2003 (gmt 0)

I can see why your nick is mischief. WOW!

ncw164x




msg:280396
 12:37 pm on May 14, 2003 (gmt 0)

mischief... I take it you won't be attending the annual logwatching convention then, where 1000s of people meet up and exchange lines of text?

mischief




msg:280397
 1:21 pm on May 14, 2003 (gmt 0)

No, I'm busy this year (got some scripts I need to get finished...).

dmorison




msg:280398
 1:59 pm on May 14, 2003 (gmt 0)

Real time log watching is a great way to learn how users view your site. I too have a dual monitor set-up, and as I see a visitor come in I will often fire up the website on my own system and actually navigate in synchronisation with the visitor just to see how much of a page could be taken in during the time they spent there.

I'm "lucky" in that my commercial site has steady but not overwhelming traffic, such that real time log watching is possible - there is rarely more than one person on the site at a time.

I do a couple of things with cookies and Apache configuration to make this a more manageable process.

Firstly, I have a "secret" page on my site that sets a cookie called "nolog". I then have a bunch of SetEnvIf lines in httpd.conf that decide whether or not to log a request.

The first line is:

SetEnvIfNoCase Request_URI .* normalrequest=true

Which sets an environment variable "normalrequest" to start off with the "log everything" case.

Then...


SetEnvIfNoCase Request_URI \.gif$ !normalrequest
SetEnvIfNoCase Cookie .*nolog.* !normalrequest

...which clear "normalrequest" if the request is for a .gif image or from me (because of the nolog cookie). Similar lines kill logging for .css, bots etc.

This is used together with...


CustomLog [existing parameters] env=normalrequest

There's probably an easier way to do all that using .htaccess, subdirectories and whatnot, but it only took a few minutes to set up.

I've since developed an impulsive glance towards my logging monitor whenever my peripheral vision detects a shift up the screen!

mischief




msg:280399
 1:44 am on May 13, 2003 (gmt 0)

This is pretty similar to what I have:

SetEnvIfNoCase Request_URI "\.(gif|jpeg|jpg|png|css|js|swf|pdf)$" notapage
SetEnvIfNoCase Remote_Addr "127\.0\.0\.1" notapage

With "127.0.0.1" being the IP of my workstation.

(I've realised that this is the wrong way to do it, though - it would be much easier to just set what file types are pages, rather than all those that aren't.)

Then it just pipes the results to a script:

CustomLog |/path/to/docroot/write-log.sh watching env=!notapage

"write-referer-log.sh" just reads the line in, checks if it matches some stuff I'm not interested in, and writes a line to the log if not. It kind of sucks because the script has to be loaded and executed (which in turn opens the log file, writes a line, and closes it) for every hit. It's not like the server's falling over or anything, but it's sort of offensive to anyone who cares about efficiency.

It would be nice to have another monitor though! I've developed a sort of nervous twitch - if I don't flip to the window with logs open once every couple of minutes it feels like something's not quite right...
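As a rough sketch of what a piped-log filter like the one described above could look like: this is not the actual script from the post, and WATCH_LOG, IGNORE_REGEX, their default values, and the function name are all assumed for illustration. Apache normally starts a piped log program once and keeps it running, so a read loop like this can handle every hit from a single process.

```shell
#!/bin/sh
# Hypothetical settings: the output file and the pattern for
# lines that are not worth watching.
WATCH_LOG=${WATCH_LOG:-/tmp/watching.log}
IGNORE_REGEX=${IGNORE_REGEX:-'favicon\.ico|robots\.txt'}

# write_log: read log lines from stdin until EOF and append the
# ones that do not match IGNORE_REGEX to WATCH_LOG.
write_log() {
    while IFS= read -r line; do
        if ! printf '%s\n' "$line" | grep -E -q -- "$IGNORE_REGEX"; then
            printf '%s\n' "$line" >> "$WATCH_LOG"
        fi
    done
}

# When used as the piped-log program, the file would end with a
# bare call to the function:
#   write_log
```

It still spawns one grep per line, so it is no answer to the efficiency complaint, but it keeps the filtering rules editable without root.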

jesserud




msg:280400
 6:22 pm on May 13, 2003 (gmt 0)

Damn you guys for introducing me to this awesome time suck.
