- If you choose to block problem actors from your website, there are several methods of blocking:
• Check Header fields and block if abnormal
• Block Server Farm IP Ranges
• Block by behavior: requests arriving too fast, requests for pages but no supporting files, requests for supporting files but no pages, using the same page as the referrer, or requesting the same file redundantly more than 3 times.
• Block by User Agent: block known scrapers & malicious actors (lists are maintained at the Search Engine Spider & User Agent Identification forum).
• Block if no UA
• Block if HTTP/1.0: this is an old protocol now used mostly by older bots.
• Block if the UA changes more than 3 times from the same IP. Sometimes proxy & VPN users (schools, for example) share one IP address while individual users present different UAs; scraping software, however, often rotates UAs deliberately as a means of access.
• Block by referrer: hot-linking, bad neighborhoods, etc
• Block if there are redundant requests for the same page more than 3 times within a set time frame. Some bots request files very fast, far beyond what a browser does.
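Several of the checks above (no UA, HTTP/1.0, rotating UAs, redundant requests) can be combined into a single per-IP filter. The following is a minimal sketch, not production code: the `BlockFilter` class, its thresholds, and the request fields it inspects are all illustrative assumptions, and a real deployment would enforce these rules at the web server or firewall layer rather than in application code.

```python
import time
from collections import defaultdict, deque

# Tunable thresholds -- illustrative values, not recommendations.
MAX_SAME_FILE = 3    # same file requested more than 3 times
MAX_UAS_PER_IP = 3   # more than 3 distinct UAs from one IP
WINDOW_SECONDS = 60  # time frame for the redundancy check

class BlockFilter:
    """Tracks per-IP behavior and decides whether to block a request."""

    def __init__(self):
        # ip -> path -> recent request timestamps
        self.file_hits = defaultdict(lambda: defaultdict(deque))
        # ip -> set of User-Agent strings seen
        self.uas_seen = defaultdict(set)

    def should_block(self, ip, path, user_agent, protocol, now=None):
        now = time.time() if now is None else now

        # Block if no UA at all.
        if not user_agent:
            return True

        # Block HTTP/1.0 clients (mostly older bots).
        if protocol == "HTTP/1.0":
            return True

        # Block if this IP keeps switching UAs.
        self.uas_seen[ip].add(user_agent)
        if len(self.uas_seen[ip]) > MAX_UAS_PER_IP:
            return True

        # Block redundant requests for the same file within the window.
        hits = self.file_hits[ip][path]
        hits.append(now)
        while hits and now - hits[0] > WINDOW_SECONDS:
            hits.popleft()
        if len(hits) > MAX_SAME_FILE:
            return True

        return False
```

Note the trade-off in the UA check: shared-IP users (the school/VPN case above) will accumulate distinct UAs over time, so a real filter would expire old entries rather than keep them forever.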
*Blocking server ranges may or may not be an effective defense against unwanted activity on your website. Hosting companies lease ranges to a wide variety of clients, not all of whom are hostile to your site's interests; some may be extremely helpful.
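Checking whether a visitor falls inside a leased server-farm range is straightforward with standard IP tools. A minimal sketch follows; the ranges shown are RFC 5737 documentation blocks standing in for a real hosting provider's allocations.

```python
import ipaddress

# Example ranges only (RFC 5737 documentation blocks), standing in
# for a hosting provider's leased server-farm ranges.
BLOCKED_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]

def ip_in_blocked_range(ip: str) -> bool:
    """Return True if the address falls in any blocked server range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_RANGES)
```

Because whole ranges can catch helpful clients too, a range match is better treated as one signal among several than as an automatic block.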
Note: If you choose to block without prejudice, be prepared to review your server logs each day with diligent focus to see exactly who is being blocked. This takes consistent maintenance.
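That daily log review can be partly automated. The sketch below assumes the common combined access-log format and counts 403 (blocked) responses per client IP; the format and the idea of keying on 403s are assumptions about your setup, not a universal rule.

```python
import re
from collections import Counter

# Matches the start of a combined-format access log line:
#   IP ident user [timestamp] "REQUEST" STATUS ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3})')

def blocked_counts(lines):
    """Count 403 (blocked) responses per client IP."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == "403":
            counts[m.group(1)] += 1
    return counts
```

Sorting the resulting counter and eyeballing the top entries each day makes it much easier to spot a legitimate visitor caught in the net.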