I just noticed something strange. Everything had been quiet, then all of a sudden I got slammed with tons of Wget requests from 10 different IPs, mostly from universities around the world. They are all banned via .htaccess now. However, I was wondering what your thoughts are on why this happened?
PS. I'm using Froggyman's ban_bot.cgi script, but Wget is ruthless about not giving up.
Or, if the requests are coming one after another, it could be a script using wget via proxies. Unfortunately, wget can change its user agent, so just banning the UA may not stop it if the guy on the other end is clever.
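Since the UA can be faked, one option is to look at request volume per IP instead. Here is a rough Python sketch of that idea, assuming your access log puts the client IP first on each line (as Apache's common and combined formats do). The function name `busy_clients` and the threshold value are made up for illustration; tune them to your own traffic.

```python
import re
from collections import Counter

# The client IP is the first whitespace-delimited field in Apache's
# common/combined log formats (an assumption about your log layout).
FIRST_FIELD = re.compile(r'^(\S+)\s')

def busy_clients(lines, threshold=100):
    """Return {ip: hits} for every client with at least `threshold` requests.

    The user agent is deliberately ignored, because it can be spoofed.
    """
    hits = Counter()
    for line in lines:
        match = FIRST_FIELD.match(line)
        if match:
            hits[match.group(1)] += 1
    return {ip: count for ip, count in hits.items() if count >= threshold}
```

You would feed it the lines of your log file, e.g. `busy_clients(open("access.log"), threshold=200)` (the path here is hypothetical), and then decide by hand which of the reported IPs deserve a ban. It only counts totals, so a slow, polite crawler and a burst of hits look the same unless you run it on a short slice of the log.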
The machine is acting as a client for a distributed search engine, and
is crawling sites sent down to it from a central server. You can hit the
web site of the project at www.grub.org, and probably submit your URL to
be placed on a "Do not crawl" list *grin* You might want to email Kord
(the head of the project) with any suggestions as to throttling and
such.
So I e-mailed them at support@grub.org. :)
This seems to explain the problem from my first post.