Msg#: 3500232 posted 3:24 pm on Nov 9, 2007 (gmt 0)
I'm sorry, I have to ask what you are trying to accomplish here.
Anyone who knows what a robots.txt file is and wants to read yours will be able to just spoof the user agent. I don't think that's any kind of secret hacker knowledge. And regular users will never see the robots.txt file.
If you really want to lock your robots.txt file down, you may have better luck allowing access only to the IP addresses of known spiders.
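One way to sketch that in Apache 2.2-era .htaccess syntax: restrict robots.txt to an IP allowlist. The ranges below are documentation placeholders (TEST-NET), not real search-engine ranges; you would have to look up and maintain the actual spider ranges yourself.

```apache
# Hypothetical sketch: serve robots.txt only to known-spider IP ranges.
# 192.0.2.0/24 and 198.51.100.0/24 are placeholder ranges -- substitute
# the current published ranges for the spiders you want to allow.
<Files "robots.txt">
    Order Deny,Allow
    Deny from all
    Allow from 192.0.2.0/24
    Allow from 198.51.100.0/24
</Files>
```

Everyone outside the listed ranges gets a 403 for robots.txt, which is exactly the behavior the next reply warns about.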
Msg#: 3500232 posted 3:55 pm on Nov 9, 2007 (gmt 0)
I agree. robots.txt and your custom 403 error page are two files which should be universally accessible, even to banned IP addresses and user-agents. The reasoning here is that if robots.txt is inaccessible, robots are likely to interpret that as meaning that the entire site may be spidered. And of course, if you return a 403 response when a banned user-agent tries to access your custom 403 page, then your server ends up in a loop.
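A common way to implement that exemption is to short-circuit the ban rules for those two files before any blocking happens. This is a hedged sketch: the 403 page path (/errors/403.html) and the bad-bot pattern are made up for illustration.

```apache
RewriteEngine On
ErrorDocument 403 /errors/403.html

# Always serve robots.txt and the custom 403 page, even to banned
# clients, so robots get a definitive answer and the 403 page never
# triggers another 403 (the loop described above).
RewriteRule ^(robots\.txt|errors/403\.html)$ - [L]

# Example ban rule (hypothetical user-agent pattern):
RewriteCond %{HTTP_USER_AGENT} badbot [NC]
RewriteRule .* - [F]
```

Because the exemption rule carries the [L] flag and appears first, matching requests never reach the ban conditions below it.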
That said, I'm not sure what would produce a 501 Not Implemented error, but the missing "%" in your first RewriteCond may cause a 500 Server Error.
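For reference, server variables in a RewriteCond test string must be written with a leading "%" and braces; the pattern below is a hypothetical example, not the poster's actual rule.

```apache
# Correct: %{HTTP_USER_AGENT} expands to the request's User-Agent header.
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC]
RewriteRule ^robots\.txt$ - [F]

# Incorrect: {HTTP_USER_AGENT} without the "%" is treated as a literal
# string, so the condition never matches what you intended -- and a
# malformed directive can surface as a 500 Server Error.
```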
Msg#: 3500232 posted 9:35 pm on Nov 9, 2007 (gmt 0)
Be prepared to monitor it closely then, and do not make any mistakes with user-agents or IP address ranges. Check the major search engine spider IP address ranges at least once every week so you don't block them and lose your rankings. This is cloaking, and cloaking successfully is a full-time job.
Just my opinion, but I think there are better things to spend your time on. Why not let your competitors see your robots.txt, but put a few extra Disallow entries in there that don't really exist, and are not linked from anywhere on the Web? Then if you ever see an access to one of those Disallowed fake URL-prefixes, you can rewrite it to a script that bans the IP address. ;)
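The honeypot idea above can be sketched in two pieces. In robots.txt, add a Disallow line for a path that does not exist and is linked from nowhere (the path name here is invented for illustration):

```apache
# robots.txt entry (the path is a made-up honeypot, nothing real lives there):
#   User-agent: *
#   Disallow: /no-such-dir/
#
# .htaccess: any request for the honeypot prefix is handed to a banning
# script. /ban.cgi is a hypothetical script you would write yourself to
# record and block the requesting IP address.
RewriteEngine On
RewriteRule ^no-such-dir/ /ban.cgi [L]
```

Well-behaved robots never request the disallowed prefix, so any hit on it comes from something that read robots.txt and deliberately ignored it.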