|Watching webserver logs|
where does the time go?
| 6:53 pm on May 7, 2003 (gmt 0)|
Does anybody else here do a 'tail -f access_log' on their webserver and watch the hits stream by like the computer screen on the Matrix?
Or is it just me? :)
| 6:54 pm on May 7, 2003 (gmt 0)|
that's a great way to get high blood pressure. ;)
| 8:20 pm on May 7, 2003 (gmt 0)|
Yes bonanza, I do it to check what's hitting the server and to find out what new UAs or spiders are out there. You start by only doing 10 minutes and next thing you know an hour has passed by.
| 9:32 pm on May 13, 2003 (gmt 0)|
My name is jpjones and I'm a log-watchaholic.
I'm afraid I'm frequently guilty of this. I have a dual-monitor setup at work, and one screen frequently has several windows showing the live logs of various webservers.
Interesting to watch, and I sometimes go as far as to filter it to show only web page requests, and then follow visitors' progress live through the website. Useful to watch when you've added a special offer to the homepage, and just for gauging surfers' habits.
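A rough sketch of a page-only filter like the one described above. It assumes Apache's common or combined log format, where the request is logged as "GET /path HTTP/1.x". The function name pages_only and the list of page extensions are made up for illustration, and --line-buffered is a grep option (GNU and BSD) that may not exist in every grep.

```shell
# pages_only: keep only log lines whose request looks like a page.
# Assumes the request appears as "METHOD /path HTTP/x.y" in each line.
# The extension list is a guess; adjust it for your own site.
pages_only() {
    grep -E --line-buffered '"(GET|POST) [^ "]*(/|\.html?|\.php)(\?[^ "]*)? HTTP/'
}

# Usage (not run here, since tail -f never exits):
#   tail -f access_log | pages_only
```

Requests for directories, .htm/.html and .php pass through; images, stylesheets and scripts are dropped because their URLs do not end in one of the listed forms.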
| 12:14 am on May 14, 2003 (gmt 0)|
tail -f access_log | grep googlebot
| 4:12 am on May 14, 2003 (gmt 0)|
Hmm... I don't know about being a log-watchaholic, but I guess I spend a fair amount of time logwatching. First I started off with a simple tail -f, but there's too much uninteresting (therefore annoying) stuff flowing by, so I changed it to tail -f logfile | grep -v boring | grep -v uninteresting. But that kind of sucks, so I started using egrep: tail -f logfile | egrep -v '(boring|uninteresting|nono|yawn)'. That was quite nice for a while, until I found myself with multiple accounts that logs were stored on. I'd make a small change to filter something new on one box, then expect it to be on another box, find it wasn't, not remember which machine it was on so redo it... very inefficient. So I made it an alias in my dotfiles that used an environment variable as the regex to filter out - that way I can keep the same alias everywhere, and just copy the regex to my bash_profiles. The problem was that the log files were in different places on different boxes, so I ended up making another var for the location of log files, which was specific to each host. Then of course there's some bits from each log file that are different between servers, so the obvious solution was to split $LOG_IGNORE_REGEX into $LOG_IGNORE_REGEX_COMMON and $LOG_IGNORE_REGEX_LOCAL (not the actual names, but similar). This was also nice for a while, but add just one more web server and it started getting difficult keeping things synchronised. No problem really though - a couple of little bash functions to distribute dotfiles between servers would solve it, of course with a few more environment variables to make sure each machine distributes to the right places and not itself. I think I'm sorted now though.
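For illustration, the common/local split described above might look something like this in a bash_profile. This is only a sketch: the variable names follow the "similar" names given in the post, while LOG_FILE, the example patterns, the log path, and the function names are assumptions. Shell functions are used here instead of an alias because they are easier to compose.

```shell
# Shared pattern, copied unchanged to every box (example values only).
LOG_IGNORE_REGEX_COMMON='\.(gif|jpg|png|css|js)|favicon\.ico'

# Per-host settings, different on each machine (example values only).
LOG_IGNORE_REGEX_LOCAL='/server-status'
LOG_FILE='/var/log/apache/access_log'   # assumed location

# log_filter: drop any line matching either ignore pattern.
# Both variables must be non-empty: an empty group would match every line.
log_filter() {
    grep -E -v --line-buffered "(${LOG_IGNORE_REGEX_COMMON})|(${LOG_IGNORE_REGEX_LOCAL})"
}

# logwatch: follow this host's log through the filter.
logwatch() {
    tail -f "$LOG_FILE" | log_filter
}
```

With this layout only the two host-specific assignments differ between machines; the functions and the common pattern can be distributed as-is.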
| 4:23 am on May 14, 2003 (gmt 0)|
I can see why your nick is mischief. WOW!
| 12:37 pm on May 14, 2003 (gmt 0)|
mischief... I take it you won't be attending the annual logwatching convention then, where 1000s of people meet up and exchange lines of text.
| 1:21 pm on May 14, 2003 (gmt 0)|
No, I'm busy this year (got some scripts I need to get finished...).
| 1:59 pm on May 14, 2003 (gmt 0)|
Real time log watching is a great way to learn how users view your site. I too have a dual monitor set-up, and as I see a visitor come in I will often fire up the website on my own system and actually navigate in synchronisation with the visitor just to see how much of a page could be taken in during the time they spent there.
I'm "lucky" in that my commercial site has steady but not overwhelming traffic, such that real time log watching is possible - there is rarely more than one person on the site at a time.
I do a couple of things with cookies and Apache configuration to make this a more manageable process.
Firstly, I have a "secret" page on my site that sets a cookie called "nolog". I then have a bunch of SetEnvIf lines in httpd.conf that decide whether or not to log a request.
The first line is:
SetEnvIfNoCase Request_URI .* normalrequest=true
Which sets an environment variable "normalrequest" to start off with the "log everything" case.
SetEnvIfNoCase Request_URI \.gif$ !normalrequest
SetEnvIfNoCase Cookie .*nolog.* !normalrequest
...which clear "normalrequest" if the request is for a .gif image or from me (because of the nolog cookie). Similar lines kill logging for .css, bots etc.
This is used together with...
CustomLog [existing parameters] env=normalrequest
There's probably an easier way to do all that using .htaccess and subdirs and whatnot, but it only took a few minutes to set up.
I've since developed an impulsive glance towards my logging monitor whenever my peripheral vision detects a shift up the screen!
| 1:44 am on May 13, 2003 (gmt 0)|
This is pretty similar to what I have:
SetEnvIfNoCase Request_URI "\.(gif|jpeg|jpg|png|css|js|swf|pdf)$" notapage
SetEnvIfNoCase Remote_Addr "127\.0\.0\.1" notapage
With "127.0.0.1" being the IP of my workstation.
(I've realised that this is the wrong way to do it, though - it would be much easier to just set what file types are pages, rather than all those that aren't.)
Then it just pipes the results to a script:
CustomLog |/path/to/docroot/write-log.sh watching env=!notapage
"write-log.sh" just reads the line in, checks if it matches some stuff I'm not interested in, and writes a line to the log if not. It kind of sucks because the script has to be loaded and executed (which in turn opens the log file, writes a line, and closes it) for every hit. It's not like the server's falling over or anything, but it's sort of offensive to anyone who cares about efficiency.
It would be nice to have another monitor though! I've developed a sort of nervous twitch - if I don't flip to the window with logs open once every couple of minutes it feels like something's not quite right...
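One possible way around the per-hit cost of the logging script, sketched on the assumption that Apache starts a piped log program once and then feeds it log entries on standard input: have the script loop over stdin so it stays running, and redirect the whole loop so the output file is opened once. The function name, the skip patterns, and the output path below are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical long-running version of the piped log script.

# filter_hits: read log lines from stdin and print the ones worth keeping.
# Uses shell pattern matching only, so no extra process is started per hit.
filter_hits() {
    while IFS= read -r line; do
        case "$line" in
            *Googlebot*|*/robots.txt*) ;;    # hypothetical things to skip
            *) printf '%s\n' "$line" ;;
        esac
    done
}

# The last line of the real script would be something like the following
# (commented out so this block can be sourced safely; path is an assumption):
#   filter_hits >> /path/to/watching.log
```

Because the redirect wraps the whole loop, the log file stays open for as long as the script runs instead of being opened and closed for every line.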
| 6:22 pm on May 13, 2003 (gmt 0)|
Damn you guys for introducing me to this awesome time suck.