The robot from bundy.infomaxinc.com accessed two of my domains. It did not request robots.txt and was denied further access by requesting my ban script which is linked from "hidden" anchors.

REMOTE_HOST: 18.104.22.168 (bundy.infomaxinc.com)
HTTP_USER_AGENT: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5b) Gecko/20030722 Mozilla Firebird/0.6Connection: close

22.214.171.124 - - [22/Aug/2003:20:24:06 -0400] "GET [example.com...] HTTP/1.0" 200 8506 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5b) Gecko/20030722 Mozilla Firebird/0.6Connection: close"
The REMOTE_HOST and HTTP_USER_AGENT are now permanently denied access by .htaccess directives.
"Connection: close" is appended to the HTTP_USER_AGENT string. The string should end immediately following the Mozilla Firebird version number.
The request for the domain root Index is logged as "GET [example.com...] HTTP/1.0". Legitimate requests from Mozilla Firebird for the domain root Index are logged as "GET / HTTP/1.1" or "GET /index.html HTTP/1.1".
Inline image files were not requested.
Documents linked in "hidden" anchors were requested.
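For anyone who wants to scan their own logs for the first two signs listed above, here is a rough Python sketch. It assumes the Apache combined log format, and the function name suspicious_signs and the sample lines are my own illustrations: the address 192.0.2.1 and the URL http://example.com/ are placeholders, not values from my log. Treat the checks as hints only, since a request line carrying a full URL can also come through a proxy.

```python
import re

# Assumed layout: Apache "combined" log format
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def suspicious_signs(line):
    """Return a list of anomalies found in one log line (hypothetical helper)."""
    m = LOG_RE.match(line)
    if m is None:
        return ["unparseable"]
    signs = []
    # A header name leaking into the User-Agent suggests a sloppy client.
    if "Connection:" in m.group("agent"):
        signs.append("header text inside User-Agent")
    # Browsers talking directly to a server normally send a path such as "/".
    if not m.group("target").startswith("/"):
        signs.append("request target is not a path")
    return signs

# Hypothetical sample lines; the host and URL are placeholders.
UA = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5b) "
      "Gecko/20030722 Mozilla Firebird/0.6")
bot_line = ('192.0.2.1 - - [22/Aug/2003:20:24:06 -0400] '
            '"GET http://example.com/ HTTP/1.0" 200 8506 "-" '
            '"' + UA + 'Connection: close"')
browser_line = ('192.0.2.1 - - [22/Aug/2003:20:24:06 -0400] '
                '"GET / HTTP/1.1" 200 8506 "-" "' + UA + '"')

print(suspicious_signs(bot_line))
print(suspicious_signs(browser_line))
```

The missing image requests and the visits to hidden anchors need more than one log line to detect, so they are left out of this sketch.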
The remote user is accused of mining content from my web site and I believe that the evidence presented above is sufficient to bring in a guilty verdict.
Please note that I am not denying access for requests from Mozilla Firebird. I am denying access for requests from the HTTP_USER_AGENT exactly as recorded in my log file.
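To make the distinction concrete, here is a small Python sketch of an exact-match test. It is only an illustration of the idea, not my actual .htaccess rule, and the names BANNED_AGENT and is_banned are made up for the example.

```python
# The User-Agent string exactly as it was recorded in the log,
# including the stray "Connection: close" at the end.
BANNED_AGENT = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5b) "
                "Gecko/20030722 Mozilla Firebird/0.6Connection: close")

def is_banned(user_agent):
    """Deny only the exact recorded string, not every Firebird visitor."""
    return user_agent == BANNED_AGENT

# A well-formed Firebird User-Agent ends right after the version number.
genuine_firebird = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5b) "
                    "Gecko/20030722 Mozilla Firebird/0.6")

print(is_banned(BANNED_AGENT))
print(is_banned(genuine_firebird))
```

Matching the whole string rather than a substring such as "Firebird" is what keeps ordinary Firebird users from being caught by the ban.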
Also, another bot came in June from berg.dbsmarketing.net which resolved to 126.96.36.199 and is in the same IP block (listed in SPEWS and assigned to cybercon.com). It exhibited the same behavior.
I posted about both of them back in June in dbsmarketing.net & infomaxinc.com [webmasterworld.com]. That post included some research showing that the IP addresses were listed in SPEWS. The post is still there, but for some reason it does not turn up in a site search for "infomaxinc.com."
The visits by these two bots were what finally got me motivated to install an automated bot trap.