Since the 9th of this month, one of my sites has been crawled by a bot that AWStats identifies like this:
Unknown robot (identified by 'spider') .. on 09 Oct 2003 - 17:35
Unknown robot (identified by 'robot') .. on 18 Oct 2003 - 16:26
Unknown robot (identified by 'crawl') .. on 20 Oct 2003 - 05:14
Thanks for any help.
Cheers
Total of 87 hits in under 2 minutes (mostly images). Took the front page of the site with images (including rollover images), then a few other pages. Two pages were requested twice. No request for robots.txt.
Looks like a browser to me.
Report abuse: abuse@tpg.com.au
If it's a regular surfer you will see a hit in your logs for each graphic on the page. Did he pull any graphics outside the few pages accessed?
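The checks mentioned in this thread (was robots.txt requested, how many hits were images, which pages were fetched twice) can be pulled out of a log with a few lines of script. This is only a rough sketch: the function name `summarize_client`, the extension list and the sample paths are made up for illustration, and it assumes you have already extracted the requested paths for one client IP from your access log.

```python
from collections import Counter

# Assumption: these extensions cover the graphics on the site.
IMAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png")

def summarize_client(paths):
    """Summarize the request paths logged for a single client."""
    counts = Counter(paths)
    images = [p for p in paths if p.lower().endswith(IMAGE_EXTENSIONS)]
    return {
        "total_hits": len(paths),
        "image_hits": len(images),
        # Well-behaved crawlers usually ask for robots.txt; browsers do not.
        "asked_for_robots_txt": "/robots.txt" in counts,
        # Paths requested more than once.
        "repeated_paths": sorted(p for p, n in counts.items() if n > 1),
    }

# Hypothetical paths, not taken from the real log.
sample = ["/", "/img/logo.gif", "/img/nav_over.gif", "/about.html", "/about.html"]
summary = summarize_client(sample)
print(summary)
```

A high share of image hits with no robots.txt request points towards a browser (or an offline downloader) rather than a search engine crawler, but none of these signals is conclusive on its own.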
Since AWStats doesn't know about every bot there is, the developer has taken a pretty neat route: it analyzes the user agent (UA) for a few keywords ("spider", "robot" and "crawl") and tries to identify otherwise unknown bots that way.
To use one (common) example, AWStats doesn't identify LookSmart's "grub" crawler by name, but because of the use of the word "crawl" in the UA, AWStats catches grub that way...
Mozilla/4.0 (compatible; grub-client-1.5.3; Crawl your own stuff with http*://grub.org)
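The keyword approach can be sketched roughly like this. It is not AWStats' actual code (AWStats is written in Perl), just an illustration of the idea. The function name `classify_user_agent` is made up, and the case-insensitive matching is an assumption based on the grub UA above, where "Crawl" is capitalized.

```python
# Keywords mentioned in this thread; the matching logic is an assumption.
GENERIC_BOT_KEYWORDS = ("spider", "robot", "crawl")

def classify_user_agent(user_agent):
    """Return an AWStats-style label if the UA contains a bot keyword, else None."""
    ua = user_agent.lower()  # match regardless of case ("Crawl" vs "crawl")
    for keyword in GENERIC_BOT_KEYWORDS:
        if keyword in ua:
            return f"Unknown robot (identified by '{keyword}')"
    return None

grub_ua = "Mozilla/4.0 (compatible; grub-client-1.5.3; Crawl your own stuff with http*://grub.org)"
print(classify_user_agent(grub_ua))
```

With the grub UA this reports a match on 'crawl'. A UA that contains none of the keywords falls through and returns None, so a bot that poses as a normal browser would not be caught this way.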