The List
Slurp : inktomi.com
GoogleBot : google.com
Scooter : altavista.com
DirectHit : directhit.com
Fast : alltheweb.com
teoma : teoma.com
ArchitextSpider : excite.com
Gulliver : northernlight.com
T-Rex : lycos.com
The bot is 202.212.5.32 -> goo311.inktomi.com
The requests come in like this:
at 1:59:58 PM on Monday, June 9, 2001
at 2:00:00 PM on Monday, June 9, 2001
at 2:00:01 PM on Monday, June 9, 2001
at 2:00:02 PM on Monday, June 9, 2001
at 2:00:02 PM on Monday, June 9, 2001
at 2:00:03 PM on Monday, June 9, 2001
at 2:00:04 PM on Monday, June 9, 2001
at 2:00:06 PM on Monday, June 9, 2001
at 2:00:06 PM on Monday, June 9, 2001
at 2:00:08 PM on Monday, June 9, 2001
at 2:00:09 PM on Monday, June 9, 2001
at 2:00:10 PM on Monday, June 9, 2001
at 2:00:11 PM on Monday, June 9, 2001
at 2:00:12 PM on Monday, June 9, 2001
at 2:00:14 PM on Monday, June 9, 2001
at 2:00:15 PM on Monday, June 9, 2001
at 2:00:16 PM on Monday, June 9, 2001
and on, and on...
Adding up to tens of thousands of requests per day per server.
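To put a number on that, here is a rough Python sketch that works out the request rate from the timestamps listed above. The function name request_rate is made up for illustration, and the per-day figure assumes the bot kept up this pace around the clock, which the log excerpt alone does not prove.

```python
from datetime import datetime

# Time-of-day values copied from the log excerpt above (all on the same day).
LOG_LINES = [
    "at 1:59:58 PM", "at 2:00:00 PM", "at 2:00:01 PM", "at 2:00:02 PM",
    "at 2:00:02 PM", "at 2:00:03 PM", "at 2:00:04 PM", "at 2:00:06 PM",
    "at 2:00:06 PM", "at 2:00:08 PM", "at 2:00:09 PM", "at 2:00:10 PM",
    "at 2:00:11 PM", "at 2:00:12 PM", "at 2:00:14 PM", "at 2:00:15 PM",
    "at 2:00:16 PM",
]


def request_rate(lines):
    """Return requests per second, measured as intervals over elapsed time.

    Hypothetical helper: strips the leading "at " and parses the clock time.
    """
    times = [datetime.strptime(line[3:], "%I:%M:%S %p") for line in lines]
    span = (max(times) - min(times)).total_seconds()
    if span <= 0:
        raise ValueError("need at least two distinct timestamps")
    return (len(times) - 1) / span


rate = request_rate(LOG_LINES)
# Assumption: the same pace sustained for a full 24 hours.
per_day = rate * 86400

print(f"{rate:.2f} requests/second, about {per_day:,.0f} per day if sustained")
```

Even if the bot only runs at this pace for part of the day, the total still lands in the tens of thousands.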
This version is like a virus. Sometimes it will grab pages on one domain with only 2 seconds in between, BUT the requests (Slurp/cat) come from multiple IPs on the same Inktomi C-block. Just goes to show how much they coordinate with one another. Unless they are running off separate lists of URLs from their dozen or so databases. But even so, that could still clog up a server.
So for two weeks or more this lil bugger wouldn't even glance at the robots.txt file. After that, if you don't have it disallowed in the robots.txt file, it takes a lil break for a week and starts all over again.
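For anyone who wants to sanity-check a disallow rule before uploading it, here is a small sketch using Python's standard urllib.robotparser. The robots.txt content and the /index.html path are examples only, and it assumes the bot identifies itself with the "Slurp" token from the list above. It only shows what a well-behaved crawler would do with the file; it can't make a bot read robots.txt in the first place.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt (an assumption, not taken from any real site):
# shut out Slurp entirely, leave every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# can_fetch(useragent, url) reports whether the rules permit the request.
slurp_ok = parser.can_fetch("Slurp", "/index.html")
google_ok = parser.can_fetch("GoogleBot", "/index.html")

print("Slurp allowed:", slurp_ok)
print("GoogleBot allowed:", google_ok)
```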