Moderators: Ocean10000 & incrediBILL

Search Engine Spider and User Agent Identification Forum

    
What is bot[+:,\.\;\/\\-]?
ichthyous




msg:4440135
 6:08 pm on Apr 12, 2012 (gmt 0)

I haven't seen any reference to this on WebmasterWorld or in Google. It also appears as [+:,\.\;\/\\-]bot in my logs. It's requesting hundreds of MB of pages from my site. I am also seeing many other unidentified bots at the same time:

    Unknown robot (identified by 'spider') 5,891+302 192.14 MB 12 Apr 2012 - 13:47
    bot[+:,\.\;\/\\-] 4,800+352 196.00 MB 10 Apr 2012 - 15:42
    Unknown robot (identified by 'crawl') 3,556+109 149.11 MB 12 Apr 2012 - 13:30
    Unknown robot (identified by empty user agent string) 1,655+7 54.83 MB 12 Apr 2012 - 12:52
    Unknown robot (identified by 'bot*') 1,272+106 48.68 MB 12 Apr 2012 - 13:49
    [+:,\.\;\/\\-]bot 1,215+96 34.98 MB 10 Apr 2012 - 15:21
    Unknown robot (identified by hit on 'robots.txt') 0+809 1.60 MB 12 Apr 2012 - 12:53
    Unknown robot (identified by 'robot') 731+37 30.19 MB 12 Apr 2012 - 10:16
    Unknown robot (identified by '*bot') 109+16 4.25 MB 12 Apr 2012 - 10:58
    Unknown robot (identified by 'discovery') 37 634.57 KB 12

Are these spambots or valid bots and how can I stop them? Thanks for any help

 

incrediBILL




msg:4440268
 12:23 am on Apr 13, 2012 (gmt 0)

I've not seen it before. What follows the word "bot" is a regular expression character class made up of just the special characters, but to what end?

What's the IP of this thing, China?

wilderness




msg:4440273
 12:35 am on Apr 13, 2012 (gmt 0)

FWIW, if you had provided one actual raw visitor log line, you'd have received much quicker help than you will from the drivel your stats software provides.

IP please, and/or actual raw log line!

Mokita




msg:4441257
 5:59 am on Apr 16, 2012 (gmt 0)

@wilderness, some people hosted on shared servers don't have access to raw logs. Others are simply unaware of their existence, or how to download them.

@incrediBILL, There was a thread here a couple of years ago about this that I replied to, but even though I've looked, I can't seem to find it.

@ichthyous, What you are seeing is an artefact of AWStats. If you are able to read your raw log file, you would most probably find that whichever bots it was that visited on 10 Apr 2012 - 15:42 and 10 Apr 2012 - 15:21, they certainly had a normal user-agent, not the regex puzzle quoted by AWStats.

In my case, all the ones I have bothered to follow up have been legitimate bots, and even include Google's AdsBot. It isn't one single bot, it's a whole bunch.

Bottom line, ignore it! ;)
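
For anyone who does have raw logs and wants to see which real user agents sit behind the AWStats label, one option is to run the same two patterns over the access log. This is only a rough Python sketch, not AWStats' own code. It assumes the log is in combined format (user agent as the last quoted field) and that matching is case-insensitive. The function name `user_agents_by_label` and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# The two labels from the AWStats report, reused as regular expressions.
# Assumption: they are applied case-insensitively to the User-Agent string.
AWSTATS_PATTERNS = {
    r"bot[+:,\.\;\/\\-]": re.compile(r"bot[+:,\.\;\/\\-]", re.IGNORECASE),
    r"[+:,\.\;\/\\-]bot": re.compile(r"[+:,\.\;\/\\-]bot", re.IGNORECASE),
}

# Assumption: combined log format, where the last quoted field is the user agent.
USER_AGENT_AT_END = re.compile(r'"([^"]*)"\s*$')

# Hypothetical log lines for illustration only (not taken from any real log).
SAMPLE_LINES = [
    '192.0.2.1 - - [10/Apr/2012:15:42:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"ExampleBot/1.0 (+http://example.com/bot)"',
    '192.0.2.2 - - [10/Apr/2012:15:21:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Example-bot)"',
    '192.0.2.3 - - [10/Apr/2012:15:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
]


def user_agents_by_label(log_lines):
    """Count the real user agents that match each AWStats-style pattern."""
    counts = {label: Counter() for label in AWSTATS_PATTERNS}
    for line in log_lines:
        match = USER_AGENT_AT_END.search(line)
        if not match:
            continue  # line does not end with a quoted field
        user_agent = match.group(1)
        for label, pattern in AWSTATS_PATTERNS.items():
            if pattern.search(user_agent):
                counts[label][user_agent] += 1
    return counts


if __name__ == "__main__":
    for label, agents in user_agents_by_label(SAMPLE_LINES).items():
        print(label)
        for user_agent, hits in agents.most_common():
            print(f"  {hits:>5}  {user_agent}")
```

To run it against a real log, pass an open file instead of `SAMPLE_LINES`, since the function only needs an iterable of lines.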

Andy Langton




msg:4441367
 9:36 am on Apr 16, 2012 (gmt 0)

There's an older thread here with a slightly more detailed explanation of the awstats regex:

\wbot[\/\-] in AWStats Robots/Spiders visitors
But does not appear on raw log
[webmasterworld.com]

Mokita




msg:4441576
 5:23 pm on Apr 16, 2012 (gmt 0)

Found the earlier thread here:

[webmasterworld.com...]

lucy24




msg:4441689
 10:17 pm on Apr 16, 2012 (gmt 0)

Can't help but feel that
bot\W
would cover the same territory in fewer bytes. Or possibly
bot[^\w\s].


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved