-- Search Engine Spider and User Agent Identification
---- Filtering Out Really Hard To Find Bad Bots
incrediBILL - 2:38 am on Jan 23, 2013 (gmt 0)
I have similar traps, but I find they only return a small percentage of the total trapped.
Yes, but what stumbles into those traps is often stuff pretending to be a browser that might otherwise go unnoticed. Those types of bots also read robots.txt to try to avoid the traps, which is why robots.txt is also a trap.
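To make the two traps above concrete, here is a minimal sketch in Python. It is not anyone's actual setup: the `/trap/` prefix, the `TrapLog` class, and the crude `claims_browser` check are all hypothetical names and heuristics chosen for illustration. It assumes the trap directory is disallowed in robots.txt and linked invisibly, so no well-behaved visitor should ever request it, and that a real browser has no reason to fetch robots.txt on its own.

```python
from dataclasses import dataclass, field

# Hypothetical values: a directory disallowed in robots.txt and linked
# only where no human visitor would follow it.
TRAP_PREFIX = "/trap/"
# Crude assumption: declared crawlers name themselves with one of these.
DECLARED_BOT_TOKENS = ("bot", "crawl", "spider")


def claims_browser(user_agent: str) -> bool:
    """True if the User-Agent looks like a browser and does not declare a bot."""
    ua = user_agent.lower()
    return ua.startswith("mozilla/") and not any(t in ua for t in DECLARED_BOT_TOKENS)


@dataclass
class TrapLog:
    # Maps IP address -> set of reasons that address was flagged.
    flagged: dict = field(default_factory=dict)

    def observe(self, ip: str, path: str, user_agent: str):
        """Record one request; return the reason it was flagged, or None."""
        reason = None
        if path.startswith(TRAP_PREFIX):
            # Only something ignoring robots.txt ends up here.
            reason = "hit-trap-path"
        elif path == "/robots.txt" and claims_browser(user_agent):
            # Self-described browsers fetching robots.txt are suspicious.
            reason = "browser-ua-read-robots"
        if reason is not None:
            self.flagged.setdefault(ip, set()).add(reason)
        return reason
```

Feeding each logged request through `observe` leaves `flagged` holding the addresses worth a closer look, along with which trap caught them. A real deployment would want a whitelist for verified search engine crawlers and some expiry on old entries, neither of which is shown here.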
When you get down to filtering out stealth crawlers, the Picscouts and other data miners that really don't want you to know they're watching you, every little trap helps, as does checking for little tells.
It's the combination of all the filters and tracking bugs that catches the sneakiest ones, so I never discard any method just because of a low rate of return. The mere fact that it's catching something that slipped thru all the other cracks means it's a useful tool to keep in your arsenal.
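One way to see why a low-yield filter is still worth keeping is to count what each filter catches that no other filter does. The sketch below is only an illustration of that bookkeeping: the two filters, the request dictionary layout (lowercase header names, a `path` key), and the function names are all assumptions, not a description of any particular site's filters.

```python
from collections import Counter


def trap_path(req: dict) -> bool:
    # Hypothetical trap directory that robots.txt disallows.
    return req["path"].startswith("/trap/")


def missing_accept_language(req: dict) -> bool:
    # Assumed "tell": browsers normally send this header, many scripts do not.
    return "accept-language" not in req["headers"]


# Hypothetical filter registry: name -> predicate over one request.
FILTERS = {
    "trap_path": trap_path,
    "missing_accept_language": missing_accept_language,
}


def fired(req: dict) -> list:
    """Names of every filter that flags this request."""
    return [name for name, check in FILTERS.items() if check(req)]


def unique_catches(requests: list) -> Counter:
    """Count requests that exactly one filter caught, per filter."""
    counts = Counter()
    for req in requests:
        names = fired(req)
        if len(names) == 1:
            counts[names[0]] += 1
    return counts
```

A filter with a small total but a nonzero count in `unique_catches` is catching visitors that everything else missed, which is the case for keeping it.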