I, too, noticed this site today. I set up a bot trap on my site specifically to capture and ban robots that do not follow the robots.txt standard.
The trap sent the following string via e-mail:
A bad robot hit (snipped URL) 2004-10-13 (Wed) 06:54:51 address is 220.127.116.11, agent is Faxobot/1.0
Now, if you visit the above-mentioned folder, you will be banned automatically by my site, so please don't click on the link above. ;)
Here's the article that I used to set up the trap; it is well worth the read for any web admin. Example scripts include a PHP method and a .htaccess method.
It will automate the process of banning bad bots from your site.
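For anyone who wants the general idea without the article, here is a rough Python sketch of how such a trap might work. It is not the PHP or .htaccess script from the article. The folder name `/bot-trap/`, the function `handle_request`, and the in-memory ban list are all made up for illustration; the notice string just mimics the e-mail quoted above.

```python
from datetime import datetime

# Hypothetical trap folder; it would also be listed under "Disallow:" in
# robots.txt, so well-behaved robots never request it.
TRAP_PREFIX = "/bot-trap/"

banned_ips = set()  # in-memory ban list, for illustration only


def handle_request(ip, path, agent, now=None):
    """Return (status, notice); notice is None unless the trap just fired."""
    if ip in banned_ips:
        # Already banned: every request gets a 403.
        return 403, None
    if path.startswith(TRAP_PREFIX):
        # The visitor ignored robots.txt, so ban the address and build a notice.
        banned_ips.add(ip)
        stamp = (now or datetime.now()).strftime("%Y-%m-%d (%a) %H:%M:%S")
        notice = f"A bad robot hit {path} {stamp} address is {ip}, agent is {agent}"
        return 403, notice
    return 200, None
```

A real setup would presumably persist the ban list (for example by writing deny rules to .htaccess) and send the notice by e-mail rather than returning it.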
[edited by: volatilegx at 2:00 pm (utc) on Oct. 14, 2004]
[edit reason] removed URL [/edit]
The IP Number switch took exactly two seconds.
The banned crawl, in which it was being fed 403s, ran at between two and eight seconds per file.
After the IP Number switch (from 18.104.22.168 to 22.214.171.124), it ran at roughly the same speed, only this time it requested the same file over and over again, as many as six times over a span of one minute.
All in all, I'd say a ballpark average would be one file every 2-3 seconds.
And the list goes on. They were all different pages though.
126.96.36.199 - - [21/Dec/2004:14:28:07 -0800] "HEAD /Blahblah.html HTTP/1.0" 403 0 "http://MyRootURL.com/" "Faxobot/1.0"
188.8.131.52 - - [21/Dec/2004:14:28:14 -0800] "HEAD /Blahblah.html HTTP/1.0" 200 0 "http://MyRootURL.com/" "Faxobot/1.0"
It tripped the trap with the first IP Number. Once it got tired of being fed 403s, it switched IP Numbers (again) and continued on without incident.
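If you want to spot this kind of IP switching in your own logs, something like the following Python sketch could help. It assumes the log is in the same combined format as the two lines quoted above; the regex and the function names `ips_by_agent` and `ban_evaders` are my own and not from any tool mentioned in this thread.

```python
import re
from collections import defaultdict

# Matches combined-format access log lines like the two quoted above.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def ips_by_agent(lines):
    """Map each user agent to {ip: set of status codes it received}."""
    seen = defaultdict(dict)
    for line in lines:
        m = LOG_RE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        seen[m["agent"]].setdefault(m["ip"], set()).add(int(m["status"]))
    return seen


def ban_evaders(lines):
    """List agents that got a 403 on one IP and a non-403 reply on another."""
    evaders = []
    for agent, ips in ips_by_agent(lines).items():
        blocked = {ip for ip, codes in ips.items() if 403 in codes}
        allowed = {ip for ip, codes in ips.items() if codes - {403}}
        if blocked and allowed - blocked:
            evaders.append(agent)
    return evaders
```

Fed the two log lines above, `ban_evaders` should report `Faxobot/1.0`, since it received a 403 on the first address and a 200 on the second.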