lucy24 - 6:33 pm on Jul 16, 2011 (gmt 0)
Robots get classified as either "No skin off my nose" or "I don't like your face", which bypasses most objective standards ;) Analogously: hotlinkers annoy me on principle-- and they annoy their sites' visitors even more, what with all that download time-- so they're all blocked even though the server load is minuscule overall. But everyone including the grimiest Byelorussian robot is allowed to see the 403 page, because the server weeps if they're not allowed to.*
But if I wanted to be principled about it I'd say that if you whitelist the known robots from approved sources, and lock out everyone else, you've wiped out any chance of someone starting up a genuinely new and interesting search engine. They gotta practice on someone, and their results can't possibly be any stranger than g###'s.
* I don't understand how or why this works. In the error logs, all blocked requests for an interior page-- but not the front page-- are followed by a request for the 403 page. So let's make the error processor happy. And any passing Chinese human can at least see my color scheme ;)
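
For anyone who wants the policy above spelled out, here is a rough Python sketch of the decision logic, not an actual server config. Everything named in it is an assumption for illustration: the error page path `/403.html`, the hostname `example.com`, the `badbot` user-agent fragment, and the function names. A likely explanation for the footnote is that the server fetches the custom error page through an internal request that passes through the same access rules, so exempting that one path first keeps the second lookup from being refused too.

```python
from urllib.parse import urlparse

# All of these values are hypothetical placeholders, not taken from a real site.
ERROR_PAGE = "/403.html"       # assumed path of the custom 403 page
OWN_HOST = "example.com"       # assumed hostname of the site
BLOCKED_AGENTS = ("badbot",)   # assumed user-agent fragments to refuse
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png", ".gif")


def is_hotlink(path, referer):
    """True when an image is requested from a page on some other host."""
    if not path.lower().endswith(IMAGE_SUFFIXES):
        return False
    if not referer:
        return False  # no referer sent: give the benefit of the doubt
    host = urlparse(referer).hostname or ""
    return host != OWN_HOST and not host.endswith("." + OWN_HOST)


def status_for(path, user_agent="", referer=""):
    """Return the HTTP status this policy would give a request."""
    # The error page is exempt from every block, so even a refused
    # visitor can still be served the 403 page itself.
    if path == ERROR_PAGE:
        return 200
    agent = user_agent.lower()
    if any(fragment in agent for fragment in BLOCKED_AGENTS):
        return 403
    if is_hotlink(path, referer):
        return 403
    return 200
```

The order of the checks is the whole point: the exemption for the error page comes before any of the blocks, so it holds no matter how grimy the requester is.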