
Forum Moderators: Ocean10000 & incrediBILL & phranque

200 vs 403 status codes for bot attacks

   
9:02 pm on Apr 19, 2013 (gmt 0)



Hi,
Has anyone considered a scenario where the server responds to an attacker with a fake 200 while de facto performing a 403 action?
If there is no danger of DDoS etc., why should I inform the bad guy that he is banned? Let him err...
10:07 pm on Apr 19, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



Are you talking about something like rewriting to a bad-robots page instead of smacking them with an explicit 403? Like the way many people deal with hotlinks, only now we're talking about pages instead?

I don't know if anyone has done exhaustive testing. It probably depends on the robot. And, of course, a rewrite just isn't as emotionally satisfying as an in-your-face 403 ;)
12:05 am on Apr 20, 2013 (gmt 0)



I simply do not want to promote the evolution of those... bots:)
We all know what evolution can achieve:) Let this process be slower...
Is it of utter importance "to be honest" when someone is abusing your site?
Has anyone tried substituting 200 for 403 for the requesting bot only (foreign affairs:) while still prohibiting it at the same time (internal affairs:)?
2:07 am on Apr 20, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



I really think it depends on the robot. I've met some that ran away in a huff when I started redirecting them to 127.0.0.1. Other robots simply don't care.
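
For anyone who wants to try the loopback redirect, here is a minimal sketch. It assumes Apache with mod_rewrite enabled and rules placed in .htaccess; the user-agent string ExampleBadBot is a made-up placeholder, not a real robot. Note that the robot sees a 3xx status here, not a 200.

```apache
RewriteEngine On

# Hypothetical user-agent pattern; replace with whatever identifies the robot.
RewriteCond %{HTTP_USER_AGENT} ExampleBadBot [NC]

# External redirect that points the client back at its own loopback address.
RewriteRule ^ http://127.0.0.1/ [R=302,L]
```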
9:11 am on Apr 20, 2013 (gmt 0)

phranque (WebmasterWorld Administrator)



welcome to WebmasterWorld, NoIdea!


when the server responds to an attacker with a fake 200 while de facto performing a 403 action

please describe a "fake 200" and a "de facto 403".
9:49 am on Apr 20, 2013 (gmt 0)



"Fake 200" means the code the bot gets, whereas "de facto 403" means the real behavior of the victim server. A lie, strictly speaking. Don't tell me that lying is a sin:)
Any redirect means you are admitting you've got a hit, agree? And a lie to a bot is hardly worse than a 403 or a redirect:)
11:11 am on Apr 20, 2013 (gmt 0)

phranque (WebmasterWorld Administrator)



"please describe a fake 200 and a de facto 403" means: what status code and content are provided with these responses?
11:48 am on Apr 20, 2013 (gmt 0)



Look: if I (the server) return a 403 Forbidden status to him (a bot), this means I admit I've got a hit. Then he (the bot) can simply change his weapon (some intruding/spamming kit or simply a proxy). That means he evolves, and his evolution is of no interest to me:)
I only respond with 200, maybe sending him some irrelevant page (bots can't read, I assume:).
But I don't execute any action he requests, just as you wouldn't in the case of a 403.
Please help me understand what the drawbacks are here.
12:54 pm on Apr 20, 2013 (gmt 0)

phranque (WebmasterWorld Administrator)



it sounds like you are describing a 200 OK status code and a lightweight document instead of a 403 for some or all requests by a bot or set of bots.
help us understand more about the problem.
how do you identify a "bad bot" or "bad bots"?
what resources are being requested?
are there any detectable patterns for the bot(s)? (IP, UA, referrer, rate/frequency/timing of requests, etc)

Please help me understand what the drawbacks are here.

typically the only requests you care about are those from actual human visitors and benevolent crawlers.
the "bad bots" should get whatever response takes the least resources.
1:19 pm on Apr 20, 2013 (gmt 0)



To determine "which guy is bad", IMHO it is sufficient to open your own htaccess file, et voilà! You define them by referrer or by IP etc. Then you forbid them through a rule.
Please answer: which drawbacks can you find in the scenario above?
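
A rule of the kind described here might look like the sketch below. It assumes mod_rewrite in .htaccess; the referrer spam-referrer.example and the 192.0.2.x address range are placeholders chosen for illustration, not real offenders. This is the plain 403 baseline that the fake-200 idea would replace.

```apache
RewriteEngine On

# Match either a hypothetical spam referrer ...
RewriteCond %{HTTP_REFERER} spam-referrer\.example [NC,OR]
# ... or a hypothetical address range (192.0.2.0/24 is reserved for documentation).
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.

# Leave the URL untouched and answer with 403 Forbidden.
RewriteRule ^ - [F]
```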
1:35 pm on Apr 20, 2013 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I sometimes respond to javabots and humans that are up to no good with a short timeout (503). The drawback to a 200 response is that it will add or keep that page in the bot master's crawler list. That's how their software works. Then they share, trade, and sell their lists. End result, you draw a lot more attention to your site.
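
One possible way to hand out a 503 from .htaccess is sketched below, assuming mod_rewrite. Matching on a user agent that starts with Java/ is an assumption about how "javabots" might be identified; it is not necessarily the test used in the post above.

```apache
RewriteEngine On

# Assumed identifier: the default user agent sent by many Java HTTP clients.
RewriteCond %{HTTP_USER_AGENT} ^Java/ [NC]

# A status outside the 3xx range drops the substitution and stops rewriting,
# so the client simply receives 503 Service Unavailable.
RewriteRule ^ - [R=503,L]
```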
2:20 pm on Apr 20, 2013 (gmt 0)

phranque (WebmasterWorld Administrator)



why not a 404?
that shouldn't draw attention.
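
A rough sketch of the 404 variant, assuming mod_rewrite in .htaccess and a placeholder user-agent string (ExampleBadBot is hypothetical):

```apache
RewriteEngine On

# Hypothetical user-agent pattern for the unwanted robot.
RewriteCond %{HTTP_USER_AGENT} ExampleBadBot [NC]

# Pretend the resource does not exist, whatever was requested.
RewriteRule ^ - [R=404,L]
```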
3:11 pm on Apr 20, 2013 (gmt 0)



But can I (the server) send only a 200 without returning any page? Then my server's load doesn't increase. And a 403 response may be issued later (or maybe not, it depends...).
This can be done rather simply in PHP, but is it possible to do it at the Apache (or other web server) level?
3:17 pm on Apr 20, 2013 (gmt 0)



Why not 404?
404 means Not Found. So it will try to find another page, and this means evolution, which IMHO it would be better not to promote:)
Surely lying is evil, but maybe an exception can be made for bots?
7:58 pm on Apr 20, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



But can I (server) send only 200 without returning any page?

Sure. If you've ever been to a badly coded php site you can see how easy it is ;) You can do the same with a static html page.

RewriteCond {pile on all your bad-robot data here}
RewriteRule {request here} /badbot.html [L]


And then badbot.html is a perfectly blank piece of paper. So to speak.
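
Filled in with placeholder values, the template above could look something like this. It is only a sketch: ExampleBadBot and the 192.0.2.x range are invented for illustration, and /badbot.html is assumed to sit in the document root. The first condition is an extra precaution so the rule never rewrites its own target.

```apache
RewriteEngine On

# Skip the decoy page itself.
RewriteCond %{REQUEST_URI} !^/badbot\.html$
# Hypothetical bad-robot data: a user agent OR an address range.
RewriteCond %{HTTP_USER_AGENT} ExampleBadBot [NC,OR]
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.

# Internal rewrite: the robot gets a 200 and the decoy, whatever it asked for.
RewriteRule ^ /badbot.html [L]
```

Because this is an internal rewrite and not a redirect, the robot never sees the decoy's URL; from its side every matching request simply returns 200.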
11:45 pm on Apr 20, 2013 (gmt 0)

phranque (WebmasterWorld Administrator)



an internal rewrite to a minimal and minified static document as lucy24 suggested would be the best solution.
however, the bots may see a blank page as just as much of a kiss-off as a 403 would be.
it might be better if you had "some" content but still keep it small.
no images, no external styles or scripts.
12:14 am on Apr 21, 2013 (gmt 0)



Indeed. Thank you!
It was quite a nice chat:)
Have a good day!
12:37 am on Apr 21, 2013 (gmt 0)

phranque (WebmasterWorld Administrator)



you might also check if the bad bots are sending cache or compression related request headers and reduce your server load by providing an appropriate response (for example a 304 Not Modified or a compressed body).
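
As a rough illustration, and assuming the decoy is the static badbot.html mentioned earlier with mod_deflate and mod_expires available, something like the following could be tried. Apache already answers conditional requests for static files with a 304 on its own; the one-week lifetime is an arbitrary example value.

```apache
<Files "badbot.html">
    <IfModule mod_deflate.c>
        # Compress the decoy for clients that send Accept-Encoding: gzip.
        SetOutputFilter DEFLATE
    </IfModule>
    <IfModule mod_expires.c>
        # Let clients that honor caching headers reuse their copy.
        ExpiresActive On
        ExpiresDefault "access plus 1 week"
    </IfModule>
</Files>
```

For a very small page the compression gain is negligible, so the caching part is probably the more useful half.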
 

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved