webalizer or other analyzer
hi, I have 20 virtual hosts (Apache) on my server.
Each of them has its own log file.
I want to see summary stats across all hosts, but I don't know how to do this with webalizer. Doing it per host is simple, but that's not very useful for summary statistics.
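One workaround (a sketch, not anything webalizer-specific — the paths and the `merge_clf` name below are my own) is to merge the per-vhost logs into a single chronologically ordered file and run webalizer once over that. Webalizer wants its input in time order, and CLF timestamps don't sort lexically across months, so a plain `cat` of overlapping logs isn't enough:

```shell
# merge_clf: read Common Log Format lines on stdin, emit them in
# chronological order. CLF timestamps look like
# [10/Oct/2007:13:55:36 +0000], so we build a numeric YYYYMMDDhhmmss
# sort key with awk, sort on it, then strip it again.
merge_clf() {
    awk '
    BEGIN {
        split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ");
        for (i = 1; i <= 12; i++) m[names[i]] = sprintf("%02d", i);
    }
    {
        # $4 is "[10/Oct/2007:13:55:36" -- drop "[", split on "/" and ":"
        split(substr($4, 2), t, "[/:]");
        print t[3] m[t[2]] t[1] t[4] t[5] t[6] "\t" $0;
    }' | sort -n | cut -f2-
}

# Example (adjust the glob and output paths to your layout):
#   cat /var/log/apache/*-access.log | merge_clf > /tmp/all-vhosts.log
#   webalizer -o /var/www/stats/all /tmp/all-vhosts.log
```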
Judging by the FTP timestamps, webalizer appears not to have been updated since April 2003. Four and a half years is an age in "internet time"... so how do you expect a program four years out-of-date to give you accurate stats - particularly on robots/scrapers, current search engine bots, and so on?
Perhaps it's time to look for a more up-to-date piece of software?
In regards to ...
... Webalizer uses the CLF (Common Log Format [apache.org]) among others -- so the data itself is coming right from your server logs, you can't be more accurate than that. The stats/data itself is not out of date, therefore neither are the stats/information as it is being reported. If the presentation of the data is not satisfactory, that is a different story. However, that is why the raw logs are there. Also, Webalizer is open source, so you can modify the reports as you wish. If you aren't a programmer, Webalizer does provide some configuration files that allow you to do some manipulation.
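For what it's worth, the kind of manipulation the configuration file allows looks roughly like this (directive names recalled from webalizer's sample config, so verify them against your installed webalizer.conf.sample; the paths and hostname are placeholders):

```
# webalizer.conf fragment
LogFile     /var/log/apache/access.log
OutputDir   /var/www/stats
HostName    www.example.com
Incremental yes
# Group/Hide directives reshape the reports without touching the raw log
GroupAgent  Googlebot
HideAgent   Googlebot
```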
|the data itself is coming right from your server logs, you can't be more accurate than that. |
The number of pageviews is 100% accurate, but aren't you interested in the breakdown between humans/bots/search engines?
I certainly am!
|The stats/data itself is not out of date, therefore neither are the stats/information as it is being reported. |
1. I need to know the number of human visitors to my site.
2. I want to know whether bots are pulling significant numbers of pages from my site.
3. I need to know whether the search engines are spidering my site.
If webalizer hasn't been updated for 4.5 years, it can't have an up-to-date list of bots, so item (2) is flawed; it won't have an accurate list of search engines, so (3) is flawed; and that means (1) is flawed too, since the human count will be over-estimated.
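To make those numbers concrete: a rough bot/human split can be pulled straight from a combined-format log (CLF plus referrer and user-agent fields). The `count_bots` name is mine, and the bot pattern is a deliberately tiny subset — a real pattern list needs constant updating, which is exactly the problem with a 4.5-year-old one:

```shell
# count_bots: read combined-format log lines on stdin and report a crude
# total/bot/human breakdown by pattern-matching against the user agent.
count_bots() {
    awk '
    /[Bb]ot|[Cc]rawler|[Ss]pider|Slurp/ { bot++ }
    { total++ }
    END { printf "total=%d bot=%d human=%d\n", total, bot, total - bot }
    '
}

# Example: cat /var/log/apache/*-access.log | count_bots
```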
|If the presentation of the data is not satisfactory, that is a different story. |
IMHO this is nothing to do with presentation, it's purely about numbers. webalizer may be good enough to calculate [human_pageviews] + [bot_pageviews] + [se_spider_pageviews] as an overall total but this is so trivial I reckon I can do that faster with something like
cat www.example.com.log | wc -l
I didn't think Webalizer ever recognised the difference between people and bots ... nor does it ever show useful info like the complete URL of the referring page / search engine, etc...
I have Webalizer and AWStats available to me ... I compared the results over a few months and AWStats seems to be more consistent and much more informative.
|I didn't think Webalizer ever recognised the difference between people and bots |
Five years ago nobody worried about bots. Well, perhaps one person did [webmasterworld.com] ;-)
In essence, this is my point. Look how fast software changes. Look at what happens in five years.
How can a five year old piece of software be a good fit for today's problems?
Some of the data may be out of date, e.g. IP addresses that were allocated to different organizations when the logs were created.
|Dabu The Dragon|
If you can get your hands on a raw-log cruncher you can pull out whatever data you choose. I don't think the old hosted version of the application will let you customize reports.