It's also likely that this is one of the catch-all "filters" at the end of the list, and that it wouldn't actually be applied to *known* googlebots or msnbots, but rather be used to catch second- and third-tier 'bots.
The major 'bots would probably be detected and reported specifically (depending on how your User-agent string filters are configured) before this catch-all pattern is applied.
Jim
[edited by: jdMorgan at 2:48 am (utc) on Jan. 20, 2009]
A few things to keep in mind with AWStats:
- A bot being recognized does not mean it is a good one.
- A bot being recognized does not mean it is not being spoofed by someone else.
- A bot being recognized means that, if you're only monitoring the number of visitor hits, it could reap a lot of pages unnoticed. You need to keep a regular eye on "Not viewed traffic" or "Robots/Spiders visitors" too.
Indeed, I generally flip straight through to the 'unrecognised user agents' lists, for exactly the reasons that you mention. It's the fact that this appears on the main page top ten of recognised user agents that made me raise this query. As I understand it, if AWStats recognises (or thinks it recognises, in the case of spoofs) a UA, then it lists it here. Usually in a readable format.
[edited by: bouncybunny at 3:29 pm (utc) on Jan. 20, 2009]
So, my theory is that seeing this in your stats indicates that you will find "<something>bot/" or "<something>bot-" in your raw logs, and that this is a "catch-all" pattern which likely follows the other more-specific patterns meant to detect known major search engine robots.
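To illustrate the idea, here is a minimal Python sketch of an ordered rule list with a catch-all at the end. It is not how AWStats itself is written (AWStats is Perl), and the patterns, labels, and the `classify_user_agent` helper are all hypothetical names made up for this example. The point is only that the first matching rule wins, so a generic "bot/" or "bot-" pattern placed last never sees the major robots.

```python
import re

# Hypothetical rule list, NOT the real AWStats robot definitions.
# Order matters: specific robots first, generic catch-all last.
ROBOT_RULES = [
    (re.compile(r"googlebot", re.IGNORECASE), "Googlebot"),
    (re.compile(r"msnbot", re.IGNORECASE), "MSNBot"),
    # Catch-all: anything containing "bot/" or "bot-".
    (re.compile(r"bot[/-]", re.IGNORECASE), "Miscellaneous unrecognized robots"),
]


def classify_user_agent(user_agent):
    """Return the label of the first matching rule, or None if no rule matches."""
    for pattern, label in ROBOT_RULES:
        if pattern.search(user_agent):
            return label
    return None


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "msnbot/1.1 (+http://search.msn.com/msnbot.htm)",
        "ExampleBot/1.0",  # made-up second-tier robot
        "Mozilla/5.0 (Windows NT 5.1) Firefox/3.0",
    ]
    for ua in samples:
        print(classify_user_agent(ua), "<-", ua)
```

With this ordering, "Googlebot/2.1" also contains "bot/", but it is reported under its own name because the specific rule is tried before the catch-all. Only user-agents that fall through the named rules end up under the generic pattern.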
Now, why AWStats is showing the regular-expression pattern instead of a clearer message such as "Miscellaneous unrecognized robots" might be a good question for the developers or for the AWStats support forum. If this is new behavior, it might have been caused by a bug in a recent upgrade, so it's worth checking out.
Jim