Since no official HTTP status code tells spiders they are completely unwelcome and should NEVER return, I propose a new result code '666' to tell unwanted crawlers to simply 'Go To Hell' and never come back :)
I'm only half joking here with the 666 number, as many crawlers sure don't seem to understand that a 403 Forbidden on the root of the domain means they are forbidden from the entire domain.
Some also don't seem to get that being blocked in robots.txt means keep out.
In particular, I've lately been getting a small attack from something that appears to be falsely claiming to be "ia_archiver" from China [webmasterworld.com]. It's blocked in robots.txt and gets a 403 Forbidden across the entire domain, yet it keeps cranking up the number of requests per day. It's not a real DDoS or anything, but the volume and the number of new IPs it comes from each day are quite distressing, as it looks like it could become a real problem. Obviously I could just drop China in the firewall on the server, which I've done on other servers, but I'm trying to avoid that on this particular box.
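For anyone wondering what "honoring robots.txt" looks like from the crawler's side, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the is_allowed helper are my own illustration, not the actual file on the server in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one crawler entirely, allow everyone else.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler makes this check before every request.
    print(is_allowed("ia_archiver", "http://example.com/"))
    print(is_allowed("SomeOtherBot", "http://example.com/page.html"))
```

A polite crawler runs a check like this and stops. The one hitting my box obviously does not, which is why robots.txt alone is only a request and not an enforcement mechanism.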
Anyway, I'm thinking we need to propose a new code that states in no uncertain terms "GO AWAY, STAY AWAY, AND NEVER RETURN". I wonder if any of them honor "Retry-After", as I could give them a nice retry value like "31536000" seconds, which works out to one 365-day year.
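Here is a rough sketch of the Retry-After idea as a tiny Python WSGI app. The BLOCKED_AGENTS list and the app itself are hypothetical, just to show the headers. Note that Retry-After is only defined for 503, 429, and 3xx responses, so attaching it to a 403 is non-standard, and I have no evidence that any crawler pays attention to it there.

```python
# Hypothetical blocklist of User-Agent substrings (lowercase).
BLOCKED_AGENTS = ("ia_archiver",)

# 365 days expressed in seconds: 365 * 24 * 60 * 60 = 31536000.
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def app(environ, start_response):
    """WSGI app: 403 plus a one-year Retry-After for blocked agents."""
    user_agent = environ.get("HTTP_USER_AGENT", "").lower()

    if any(name in user_agent for name in BLOCKED_AGENTS):
        body = b"Forbidden: this crawler is not welcome anywhere on this site.\n"
        start_response("403 Forbidden", [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
            # Non-standard on a 403; a hint at best.
            ("Retry-After", str(ONE_YEAR_SECONDS)),
        ])
        return [body]

    body = b"Welcome.\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To poke at it locally, something like wsgiref.simple_server.make_server("", 8000, app).serve_forever() should do. On a real Apache box the same effect would come from server config rather than application code.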
In the "amusing myself further" category, I'm also considering trying a "402 Payment Required" with instructions and a link to PayPal, to see if anyone ever wants access badly enough to pay for it. Maybe offer access at the rate of $0.01 per page for a $1 payment, or 3 pages per $0.01 for a $10 payment, etc. IMO this is a far better solution than a 403 or even my proposed 666. If they want it, pay to get it, or go away :)
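For the curious, the pricing arithmetic works out as below. This is only a sketch of the two tiers mentioned above; the TIERS table and the pages_for_payment helper are made up for illustration, and amounts are kept in whole cents to avoid floating-point rounding.

```python
# Hypothetical tiers: payment in cents -> pages granted per cent paid.
TIERS = {
    100: 1,   # $1 payment:  $0.01 per page
    1000: 3,  # $10 payment: 3 pages per $0.01
}


def pages_for_payment(payment_cents: int) -> int:
    """Return how many pages a payment of payment_cents buys."""
    if payment_cents not in TIERS:
        raise ValueError(f"no tier for a payment of {payment_cents} cents")
    return payment_cents * TIERS[payment_cents]


if __name__ == "__main__":
    for cents in sorted(TIERS):
        print(f"${cents / 100:.2f} buys {pages_for_payment(cents)} pages")
```

So the $1 tier buys 100 pages and the $10 tier buys 3000 pages, meaning the bigger payment costs ten times as much but gets thirty times the access.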