\wbot[\/\-] in AWStats Robots/Spiders visitors

But does not appear on raw log

         

bouncybunny

9:42 am on Jan 19, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



My AWStats "Robots/Spiders visitors (Top 25)" list has a robot listed as:

\wbot[\/\-]

This would indicate that AWStats recognises this bot. But I cannot find any reference to it in my logs.

Anyone got any ideas what it is?

bouncybunny

12:11 am on Jan 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I should add that this month is the first time I have noticed it and it seems to have appeared several dozen times.

jdMorgan

2:06 am on Jan 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It is quite likely that "\wbot[\/\-]" is not a literal user-agent string, but rather a regular-expression pattern meaning "match any string in which a word character is followed by 'bot' and then by a '/' or a hyphen". In practice that catches names ending in "bot", such as "Googlebot/2.1", "Googlebot-Mobile", "msnbot/1.1", or "msnbot-media/1.0".

It's also likely that this is one of the catch-all "filters" at the end of the list, and that it wouldn't actually be applied to *known* googlebots or msnbots, but rather be used to catch second- and third-tier 'bots.
The major 'bots would probably be detected and reported specifically (depending on how your User-agent string filters are configured) before this catch-all pattern is applied.
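To illustrate the ordering idea, here is a minimal Python sketch of "specific patterns first, catch-all last". The table, the labels, and the `classify` function are hypothetical and are not AWStats' actual list or code; lowercasing the user-agent before matching is also an assumption.

```python
import re

# Hypothetical, heavily simplified robot table (not AWStats' real list).
# Specific, well-known robots come first; the generic catch-all is last.
ROBOT_PATTERNS = [
    (r"googlebot", "Googlebot"),
    (r"msnbot", "MSNBot"),
    (r"\wbot[\/\-]", "Unknown robot (catch-all)"),
]

def classify(user_agent):
    """Return the label of the first pattern that matches, or None."""
    ua = user_agent.lower()  # assumption: matching is case-insensitive
    for pattern, label in ROBOT_PATTERNS:
        if re.search(pattern, ua):
            return label
    return None

print(classify("Googlebot/2.1 (+http://www.google.com/bot.html)"))
print(classify("ExampleBot/0.3"))
print(classify("Mozilla/5.0 (Windows; U) Firefox/3.0"))
```

With a table like this, a known robot is reported under its own name and only an otherwise unlisted "<something>bot/" falls through to the last entry.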

Jim

[edited by: jdMorgan at 2:48 am (utc) on Jan. 20, 2009]

bouncybunny

5:23 am on Jan 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Jim

Not sure I completely understand all of that, but I get the gist of it.

It just seemed odd that AWStats was suddenly reporting this in its list of 'known spiders'. Normally it would stick this in its rather longer list of unknowns.

Hobbs

3:10 pm on Jan 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



In simpler terms: If you search for just wbot you might find it in your raw logs.

A few things to keep in mind with AWStats:

- Bot being recognized does not mean it is a good one.
- Bot being recognized does not mean it is not being spoofed by someone else.
- Bot being recognized means that if you're only monitoring visitor hit counts, a recognized bot could reap a lot of pages unnoticed, so you need to regularly keep an eye on "Not viewed traffic" or "Robots/Spiders visitors" too.

bouncybunny

3:26 pm on Jan 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



First thing I did was search for 'wbot' in the logs. It's not there.

Indeed, I generally flip straight through to the 'unrecognised user agents' lists, for exactly the reasons that you mention. It's the fact that this appears on the main page top ten of recognised user agents that made me raise this query. As I understand it, if AWStats recognises (or thinks it recognises, in the case of spoofs) a UA, then it lists it here. Usually in a readable format.

[edited by: bouncybunny at 3:29 pm (utc) on Jan. 20, 2009]

jdMorgan

3:27 pm on Jan 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



In regex, "\w" means "any word character" and is equivalent to "[0-9A-Za-z_]" for plain ASCII text, so the "w" is not a literal here; it's part of a regex token.
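A quick way to see this is to try the pattern against a few strings. This is only a sketch using Python's `re` module (AWStats itself is written in Perl, but the pattern means the same thing in both); the sample strings are made up for illustration.

```python
import re

# The pattern exactly as it appears in the AWStats report.
PATTERN = re.compile(r"\wbot[\/\-]")

samples = [
    "Googlebot/2.1",     # "ebot/" : word character + "bot" + "/"
    "msnbot-media/1.0",  # "nbot-" : word character + "bot" + "-"
    "wbot",              # no "/" or "-" after "bot"
    "bot/1.0",           # no word character before "bot"
    "robots.txt",        # "obot" is followed by "s", not "/" or "-"
]

for s in samples:
    print(s, bool(PATTERN.search(s)))
# Googlebot/2.1 True
# msnbot-media/1.0 True
# wbot False
# bot/1.0 False
# robots.txt False
```

Note that the literal string "wbot" does not match on its own, which fits with a plain search for it in the logs coming up empty.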

So, my theory is that seeing this in your stats indicates that you will find "<something>bot/" or "<something>bot-" in your raw logs, and that this is a "catch-all" pattern which likely follows the other more-specific patterns meant to detect known major search engine robots.
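To check that theory against the raw logs, something like the following could list the user-agents the pattern would catch. It is a rough sketch that assumes the common "combined" log format, where the user-agent is the last double-quoted field on each line; the `matching_agents` helper and the case-insensitive matching are assumptions, not anything taken from AWStats.

```python
import re
from collections import Counter

# Same pattern as in the report; IGNORECASE is an assumption.
CATCH_ALL = re.compile(r"\wbot[\/\-]", re.IGNORECASE)
# Pulls out every double-quoted field on a log line.
QUOTED = re.compile(r'"([^"]*)"')

def matching_agents(lines):
    """Count user-agents (last quoted field) that match the catch-all."""
    counts = Counter()
    for line in lines:
        fields = QUOTED.findall(line)
        if fields and CATCH_ALL.search(fields[-1]):
            counts[fields[-1]] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage: python find_bots.py access.log   (the file name is up to you)
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
            for agent, hits in matching_agents(log).most_common():
                print(hits, agent)
```

Any agent it prints that AWStats does not already list by name is a candidate for what is being lumped under the pattern.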

Now why AWStats is showing the regular-expressions pattern instead of a clearer message such as "Miscellaneous unrecognized robots" might be a good question for the developers or for the AWStats support forum. If this is a new behavior, it might have been caused by a bug in a recent upgrade, so it's worth checking out.

Jim

bouncybunny

3:30 pm on Jan 20, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks Jim. I didn't realise that they had a forum; I'll head over there when I have some time.