Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

    
200 vs 403 status codes for bot attacks
NoIdea



 
Msg#: 4566471 posted 9:02 pm on Apr 19, 2013 (gmt 0)

Hi,
Has anyone considered the scenario when the server responds to an attacker with a fake 200 while de facto performing a 403 action?
If there is no danger of DDoS etc., then why should I inform this bad guy that he is banned? Let him err...

 

lucy24

WebmasterWorld Senior Member, Top Contributor of All Time, Top Contributors Of The Month



 
Msg#: 4566471 posted 10:07 pm on Apr 19, 2013 (gmt 0)

Are you talking about something like rewriting to a bad-robots page instead of smacking them with an explicit 403? Like the way many people deal with hotlinks, only now we're talking about pages instead?

I don't know if anyone has done exhaustive testing. It probably depends on the robot. And, of course, a rewrite just isn't as emotionally satisfying as an in-your-face 403 ;)

NoIdea



 
Msg#: 4566471 posted 12:05 am on Apr 20, 2013 (gmt 0)

I simply do not want to promote the evolution of those... bots :)
We all know what evolution can achieve :) Let this process be slower...
Is it of the utmost importance "to be honest" when your site is being abused?
Has anyone tried substituting a 200 for the 403 in the response to the requesting bot only (foreign affairs :), while still blocking it (internal affairs :)?

lucy24




 
Msg#: 4566471 posted 2:07 am on Apr 20, 2013 (gmt 0)

I really think it depends on the robot. I've met some that ran away in a huff when I started redirecting them to 127.0.0.1. Other robots simply don't care.

phranque

WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 4566471 posted 9:11 am on Apr 20, 2013 (gmt 0)

welcome to WebmasterWorld, NoIdea!


when the server responds to an attacker with a fake 200 while de facto performing a 403 action

please describe a "fake 200" and a "de facto 403".

NoIdea



 
Msg#: 4566471 posted 9:49 am on Apr 20, 2013 (gmt 0)

"Fake 200" means the status code the bot gets, whereas "de facto 403" means the real behavior of the victim server. A lie, strictly speaking. Don't tell me that lying is a sin :)
Any redirect means you are admitting you've taken a hit, agreed? And a lie to a bot is hardly worse than a 403 or a redirect :)

phranque




 
Msg#: 4566471 posted 11:11 am on Apr 20, 2013 (gmt 0)

"please describe a fake 200 and a de facto 403" means: what status code and content are provided with these responses?

NoIdea



 
Msg#: 4566471 posted 11:48 am on Apr 20, 2013 (gmt 0)

Look: if I (the server) return a 403 Forbidden status to him (a bot), this means I admit I've taken a hit. Then he (the bot) can simply change his weapon (some intrusion/spamming kit, or simply a proxy). This means he evolves, and his evolution is of no interest to me :)
I only respond with a 200, and maybe send him some irrelevant page (bots can't read, I assume :).
But I don't execute any action he requests, just as you don't in the case of a 403.
Please help me understand what the drawbacks are here.

phranque




 
Msg#: 4566471 posted 12:54 pm on Apr 20, 2013 (gmt 0)

it sounds like you are describing a 200 OK status code and a lightweight document instead of a 403 for some or all requests by a bot or set of bots.
help us understand more about the problem.
how do you identify a "bad bot" or "bad bots"?
what resources are being requested?
are there any detectable patterns for the bot(s)? (IP, UA, referrer, rate/frequency/timing of requests, etc)

Please help me understand what the drawbacks are here.

typically the only requests you care about are actual human visitors and benevolent crawlers.
the "bad bots" should get whatever response takes the least resources.

NoIdea



 
Msg#: 4566471 posted 1:19 pm on Apr 20, 2013 (gmt 0)

To determine "which guy is bad", IMHO it is sufficient to open your own .htaccess file, et voilà! You identify them by referrer, by IP, etc. Then you forbid them with a rule.
Please answer: which drawbacks can you find in the scenario above?
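
For reference, the conventional rule described here (identify by referrer or IP, then forbid) is commonly written along the lines below. This is only a sketch: the referrer host and the IP range are made-up placeholders (192.0.2.0/24 is a documentation-only range), and it assumes mod_rewrite is enabled for the directory.

```apache
RewriteEngine On
# Hypothetical referrer host and IP range; replace with your own data.
RewriteCond %{HTTP_REFERER} ^https?://(www\.)?spam-referrer\.example [NC,OR]
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
# "-" leaves the URL untouched; [F] answers 403 Forbidden and stops rewriting
RewriteRule ^ - [F]
```

This is the explicit-403 behavior that the rest of the thread weighs against a quieter 200.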

Key_Master

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 4566471 posted 1:35 pm on Apr 20, 2013 (gmt 0)

I sometimes respond to javabots and humans that are up to no good with a short timeout (503). The drawback to a 200 response is that it will add or keep that page in the bot master's crawler list. That's how their software works. Then they share, trade, and sell their lists. End result, you draw a lot more attention to your site.
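
One way a 503 for selected clients might be expressed in mod_rewrite is sketched below. It is not necessarily the setup described above: matching "javabots" by a User-Agent that starts with "Java/" is an assumption made for illustration.

```apache
RewriteEngine On
# Assumed pattern: default Java HTTP clients usually announce themselves as "Java/x.y"
RewriteCond %{HTTP_USER_AGENT} ^Java/ [NC]
# A status outside 300-399 in [R=...] drops the substitution and ends rewriting,
# so the client simply receives 503 Service Unavailable
RewriteRule ^ - [R=503,L]
```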

phranque




 
Msg#: 4566471 posted 2:20 pm on Apr 20, 2013 (gmt 0)

why not a 404?
that shouldn't draw attention.
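
as a rough sketch only, a 404 for a matched bot could look like this (the user-agent name is a made-up placeholder):

```apache
RewriteEngine On
# Hypothetical bot name; substitute whatever identifies the unwanted client
RewriteCond %{HTTP_USER_AGENT} ExampleBadBot [NC]
# The bot receives 404 Not Found, as if the resource did not exist
RewriteRule ^ - [R=404,L]
```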

NoIdea



 
Msg#: 4566471 posted 3:11 pm on Apr 20, 2013 (gmt 0)

But can I (server) send only 200 without returning any page? Then my server's load doesn't increase. And a 403 response may be issued later (or maybe not, it depends...)
This can be done rather simply in PHP, but is it possible to do it at the Apache (or other web server) level?

NoIdea



 
Msg#: 4566471 posted 3:17 pm on Apr 20, 2013 (gmt 0)

Why not 404?
404 means Not Found. So it will try to find another page, and that means evolution, which IMHO it would be better not to promote :)
Sure, lying is evil, but maybe an exception can be made for bots?

lucy24




 
Msg#: 4566471 posted 7:58 pm on Apr 20, 2013 (gmt 0)

But can I (server) send only 200 without returning any page?

Sure. If you've ever been to a badly coded php site you can see how easy it is ;) You can do the same with a static html page.

RewriteCond {pile on all your bad-robot data here}
RewriteRule {request here} /badbot.html [L]


And then badbot.html is a perfectly blank piece of paper. So to speak.
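
Filled in, that pattern might look something like the sketch below. The bot names and the IP range are placeholders invented for illustration, not data from any real block list, and the first condition is there only so that the stub page is not rewritten to itself.

```apache
RewriteEngine On
# Skip the stub itself, otherwise the rule could match its own target
RewriteCond %{REQUEST_URI} !^/badbot\.html$
# Hypothetical bad-robot data: match by User-Agent OR by IP range
RewriteCond %{HTTP_USER_AGENT} (ExampleBadBot|AnotherScraper) [NC,OR]
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
# Internal rewrite, no redirect: the client sees a normal 200 with the stub content
RewriteRule ^ /badbot.html [L]
```

Because this is an internal rewrite rather than a redirect, the requested URL stays the same from the robot's point of view; only the content served changes.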

phranque




 
Msg#: 4566471 posted 11:45 pm on Apr 20, 2013 (gmt 0)

an internal rewrite to a minimal and minified static document as lucy24 suggested would be the best solution.
however, the bots may see a blank page as just as much of a kiss-off as a 403 would be.
it might be better if you had "some" content but still keep it small.
no images, no external styles or scripts.

NoIdea



 
Msg#: 4566471 posted 12:14 am on Apr 21, 2013 (gmt 0)

Indeed. Thank you!
It was quite a nice chat :)
Have a good day!

phranque




 
Msg#: 4566471 posted 12:37 am on Apr 21, 2013 (gmt 0)

you might also check if the bad bots are sending cache or compression related request headers and reduce your server load by providing a sufficient response.
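
for a static stub file, apache already answers conditional requests (If-Modified-Since, If-None-Match) with a 304 on its own. a possible addition, sketched here under the assumption that the stub is called badbot.html and that mod_headers is available, is to mark the stub as cacheable so that well-behaved clients need not refetch it:

```apache
# Hypothetical stub name; the Header line is skipped if mod_headers is not loaded
<Files "badbot.html">
    <IfModule mod_headers.c>
        # Allow clients and proxies to reuse the stub for a day
        Header set Cache-Control "public, max-age=86400"
    </IfModule>
</Files>
```

whether a given bot honors caching headers at all is something only the access logs can tell you.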


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved