
Forum Moderators: Ocean10000 & incrediBILL & phranque

How Long to Ban Bad Bot Behaviour

     
8:16 pm on Apr 28, 2018 (gmt 0)

Full Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 333
votes: 23


So a bot has ravaged your site with merciless scraping, done reconnaissance, tried to break in, or shown some other bad behaviour, and you have banned it by bot UA, IP, or the host's IP range. How long do you keep the ban? Years?

I collect the few really terrible and abhorrent host providers and ban their complete ranges. Apart from these consistently bad ISPs, I collect individual IP ranges, which I date-stamp with comments. After 3 months I comment them out but keep their history, so that if they return it is easy to reban them.
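For anyone who wants to automate the "comment out after 3 months, keep the history" routine, here is a minimal sketch in Python. It assumes a convention that is not from the post itself: each ban block starts with a comment line beginning with an ISO date (`# 2018-01-15 reason`), is followed by the ban directives, and ends at a blank line. The function name `expire_bans` and the sample ranges are made up for illustration.

```python
import re
from datetime import date, timedelta

# Assumed convention (hypothetical): a ban block opens with "# YYYY-MM-DD reason"
DATE_COMMENT = re.compile(r"^#\s*(\d{4})-(\d{2})-(\d{2})\b")


def expire_bans(text: str, today: date, max_age_days: int = 90) -> str:
    """Comment out directives in ban blocks older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    out = []
    expired = False  # True while we are inside a block dated before the cutoff
    for line in text.splitlines():
        stripped = line.strip()
        match = DATE_COMMENT.match(stripped)
        if match:
            banned_on = date(*map(int, match.groups()))
            expired = banned_on < cutoff
        elif not stripped:
            expired = False  # a blank line ends the current block
        elif expired and not stripped.startswith("#"):
            line = "# " + line  # disable the directive but keep it as history
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "# 2018-01-01 scraper\n"
        "Deny from 192.0.2.0/24\n"
        "\n"
        "# 2018-04-20 probe\n"
        "Deny from 198.51.100.0/24\n"
    )
    print(expire_bans(sample, date(2018, 4, 28)))
```

The script only edits text, so the directive syntax does not matter to it. Run it on a copy of the file and diff the result before replacing the live htaccess.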

Nothing stays the same, and changes can be quick. Bot UAs, IPs, and host providers all change with time. If I keep all the bans, the htaccess can get quite large, and may just be banning historical ranges and UAs.

Do you have a ban strategy or philosophy?
10:33 pm on Apr 28, 2018 (gmt 0)

Senior Member from US 

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month

joined:Apr 9, 2011
posts:14718
votes: 614


In general, once someone is blocked, they stay blocked.

Back when my primary access control was IP-based, and UA blocks were purely supplementary, I did check back periodically. Then I'd delete the ones that hadn't been around in a year or so--unless their initial behavior was so “terrible and abhorrent” that I didn’t want to take any chances. This applies especially to scrapers attached to misguided humans (“hey, I love this site, lemme download the entire thing from top to bottom before even checking to see if maybe there’s only one directory with stuff I actually love”), since IP can’t be used as a blocking factor.

Now I do the opposite: UAs are blocked for the duration, unless of course they mend their ways. IP blocks are generally supplementary, and I check back every few months to see if they’ve gone away. The latter would be robots that are persistent and offensive--yee haw, let’s pile on the adjectives!--and that send fully humanoid headers, so an IP block is the only thing that works. Generally they get bored and go away, and don’t come back.
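One way to do the "check back every few months" step is to ask the access log when each blocked range last showed up. The sketch below is only an illustration: it assumes the log is in Apache Common or Combined Log Format with the client IP in the first field, and the function name `last_seen` and the sample ranges are invented for the example.

```python
import ipaddress
import re
from datetime import datetime

# Client address and bracketed timestamp at the start of a CLF log line
LOG_LINE = re.compile(r"^(\S+) \S+ \S+ \[([^\]]+)\]")


def last_seen(log_lines, blocked_ranges):
    """Map each blocked CIDR range to the time of its most recent request."""
    networks = [ipaddress.ip_network(r) for r in blocked_ranges]
    latest = {str(net): None for net in networks}
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # not a line in the assumed format
        try:
            addr = ipaddress.ip_address(match.group(1))
        except ValueError:
            continue  # first field is a hostname, not an IP
        when = datetime.strptime(match.group(2), "%d/%b/%Y:%H:%M:%S %z")
        for net in networks:
            if addr in net:
                key = str(net)
                if latest[key] is None or when > latest[key]:
                    latest[key] = when
    return latest


if __name__ == "__main__":
    with open("access.log", encoding="utf-8", errors="replace") as fh:
        report = last_seen(fh, ["192.0.2.0/24", "198.51.100.0/24"])
    for cidr, when in report.items():
        print(cidr, when or "not seen in this log")
```

A range that reports "not seen" across a few months of logs is a candidate for unblocking. A range that still shows fresh timestamps is still knocking.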

I do check for blocked humans and try to figure out what they did to get blocked. Most of the time it's something obvious where I can say “Well, tough, I didn’t want you anyway” but occasionally I’ve had to modify header-based rules, especially coming from mobiles. Or, in a particularly embarrassing case, I had to modify one of my carefully-crafted referer-based rules because I’d made a minor site-design change that resulted in everyone who clicked on a particular link getting blocked.
10:45 pm on Apr 28, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11528
votes: 702


I ban all Server Farm IP Ranges [webmasterworld.com] for eternity :)
I allow access to only those agents that benefit my interests. Some server farm IP ranges have ISPs (humans) so those get access as well.

UAs shown to be bad stay banned until I'm certain they are no longer in operation, usually a year or two; some, however, keep coming.

Blocking Methods [webmasterworld.com]
2:53 pm on Apr 30, 2018 (gmt 0)

Full Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 333
votes: 23


Thank you both. I will monitor my physical CPU utilization. It does not seem like a big htaccess has much effect on physical CPU usage. I will extend my 3-month ban to a year and see if they return.
7:32 pm on Apr 30, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11528
votes: 702


“It does not seem like a big htaccess has much effect on physical CPU usage”
My base level htaccess is over 200kb and Google's Page Speed tool says I have a "fast server."
7:55 pm on Apr 30, 2018 (gmt 0)

Full Member from CA 

Top Contributors Of The Month

joined:Feb 7, 2017
posts: 333
votes: 23


Thanks for the benchmark. My slimmer htaccess is currently 32kb, comments excluded. My old fat htaccess was 100kb, so I have a long way to grow.
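For comparing numbers like these, a "comments excluded" size can be measured with a few lines of Python. This is just a sketch: the function name `effective_size` is made up, and it assumes comments are whole lines starting with `#`, which is how Apache config files treat them.

```python
def effective_size(text: str) -> int:
    """Size in bytes of the file with blank and comment lines removed."""
    kept = [
        line
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return len("\n".join(kept).encode("utf-8"))


if __name__ == "__main__":
    with open(".htaccess", encoding="utf-8", errors="replace") as fh:
        content = fh.read()
    print("total bytes:", len(content.encode("utf-8")))
    print("without comments:", effective_size(content))
```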
8:37 pm on Apr 30, 2018 (gmt 0)

Moderator from US 

WebmasterWorld Administrator keyplyr is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Sept 26, 2001
posts:11528
votes: 702


Of course it depends on what you have in your htaccess. Nested conditions/rules and excessive redirects would certainly add processing time, but in the big picture, a flat file is among the least significant factors contributing to page speed.
 
