I am interested in what you are using to analyze Apache access logs, or what you would recommend. I am mostly interested in Linux programs, but Windows suggestions are welcome too. It would also be great to list both GUI (offline) and web (online) solutions. The idea is to be able to parse and read logs faster and get the reports you need, such as bot / crawler / spider visits (just an example).
So here is what I found during my research:
[GUI]KSystemLog - comes with KDE and can read Apache access logs. I had only one problem with it on Kubuntu 12.04: it didn't load the dates from the log, and instead showed 1 AM for all entries
[GUI]GoAccess - it is more like an htop for access logs. Available in Debian 7+
[WEB]Visitors - nice web reports, loads data in real time from the access log
[GUI]Adiscon LogAnalyzer - looks like a powerful search tool across various logs. Maybe overkill for just going through an access log and hunting bots
[GUI]Apache Log Viewer - this one is a rather great tool. It has filters, reports... Is there something similar for Linux too?
Msg#: 4661391 posted 2:01 am on Apr 9, 2014 (gmt 0)
welcome to WebmasterWorld, mk123!
the generalized tools are fine for monitoring trends and finding any outliers, but usually you will have to dig deeper to find out what's happening in a specific situation. for example, a scraper attack which may last for several days and span several IP addresses will be difficult to analyze using one of the generalized tools.
i do most of my ad hoc apache access log analysis using the following tools which are available on any *nix-flavored OS: grep/egrep cut sort uniq
and occasionally i use: sed
and more rarely i will use: awk
that covers 99% of my needs. anything else can be easily accomplished with a perl script. by now, i usually have something around that can be easily reused/rewritten for a specific task.
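a minimal sketch of how those tools combine in practice — say, ranking client IPs by request count to spot a scraper. the log lines below are made up for illustration; with a real log you would point the pipeline at your actual access log file:

```shell
#!/bin/sh
# Fabricated sample of a combined-format access log, just so the
# pipeline has something to chew on. Field 1 is the client IP.
printf '%s\n' \
  '10.0.0.1 - - [09/Apr/2014:02:01:00 +0000] "GET / HTTP/1.1" 200 512' \
  '10.0.0.2 - - [09/Apr/2014:02:01:05 +0000] "GET /a HTTP/1.1" 200 512' \
  '10.0.0.1 - - [09/Apr/2014:02:01:09 +0000] "GET /b HTTP/1.1" 200 512' \
  > access.log

# cut out the IP column, count repeats, and list the heaviest
# hitters first -- the classic cut | sort | uniq -c | sort -rn combo.
cut -d' ' -f1 access.log | sort | uniq -c | sort -rn | head -10
```

to narrow the count to a specific day or URL pattern, prepend a `grep` stage (e.g. `grep '09/Apr/2014' access.log | cut ...`).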
Thanks for the welcome, phranque. The tools you mentioned are what every admin has in their toolbelt whenever they need to quickly find out some information. So +1 for those, naturally.
Thanks lucy24, I am leaning towards logstash plus a custom script to make things run nicely.
Edit: as for Piwik, I am not sure it can help much with monitoring, for example, how many times and which robots are hitting my site, whether they followed robots.txt, and so on. While complete monitoring is certainly useful, right now I am tuning my tools for combat with all those bad bots out there that sometimes have no mercy at all.
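One way to answer the "did they follow robots.txt" question straight from the access log is to compare the set of user agents that fetched robots.txt against the set of all user agents seen. A rough sketch, assuming the combined log format (the bot names and sample lines below are invented for illustration):

```shell
#!/bin/sh
# Fabricated combined-format log: one "polite" bot that fetched
# robots.txt first, and one that never did.
printf '%s\n' \
  '10.0.0.1 - - [09/Apr/2014:02:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 100 "-" "GoodBot/1.0"' \
  '10.0.0.1 - - [09/Apr/2014:02:00:05 +0000] "GET /page HTTP/1.1" 200 512 "-" "GoodBot/1.0"' \
  '10.0.0.9 - - [09/Apr/2014:02:00:07 +0000] "GET /page HTTP/1.1" 200 512 "-" "BadBot/2.0"' \
  > access.log

# In the combined format, splitting on double quotes puts the
# user-agent string in awk field 6.
# User agents that requested robots.txt at least once:
grep '"GET /robots.txt' access.log | awk -F'"' '{print $6}' | sort -u > polite.txt
# All user agents seen in the log:
awk -F'"' '{print $6}' access.log | sort -u > all.txt
# Agents that never asked for robots.txt (candidates for blocking):
comm -23 all.txt polite.txt
```

This only flags agents that skipped robots.txt entirely; checking whether a bot actually *obeyed* the disallow rules would need the robots.txt contents cross-referenced against the requested paths, which is where a custom script earns its keep.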