lucy24 - 8:01 pm on Jul 3, 2013 (gmt 0)
I would like everybody to be able to read my robots file, even if their user agent is blocked in my .htaccess.
The easiest way is:
<Files "robots.txt">
Allow from all
</Files>
That takes care of anyone blocked via mod_authz_host (mod_access_compat if you're on Apache 2.4), like your ordinary "Deny from..." IP or UA blocks. (For UA blocks, that means mod_setenvif followed by "Deny from env=something-nasty".)
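For anyone who wants to see the pieces together, here is a minimal sketch of that kind of setup in Apache 2.2-style syntax. The variable name bad_bot and the user-agent string "EvilScraper" are made up for illustration; substitute whatever your own blocks use.

```apache
# Hypothetical UA block: flag the request, then deny anything flagged
SetEnvIfNoCase User-Agent "EvilScraper" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot

# Exemption: <Files> sections are applied after the directives above,
# so robots.txt stays readable even for flagged requests
<Files "robots.txt">
Order Allow,Deny
Allow from all
</Files>
```

A request from the flagged agent for any other file should still get a 403; only robots.txt is let through.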
If any of your blocks are done in mod_rewrite, you will also need a line that says
RewriteRule ^robots\.txt - [L]
Put this ahead of all your other RewriteRules, so it is evaluated before any rule that blocks.
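Roughly, the ordering looks like the sketch below, assuming the rules live in the htaccess at the site root. The blocking rule and its "EvilScraper" string are only a stand-in for whatever mod_rewrite blocks you actually have.

```apache
RewriteEngine On

# Exemption first: "-" means no substitution, [L] stops rule processing here
RewriteRule ^robots\.txt - [L]

# Hypothetical block that comes later in the file
RewriteCond %{HTTP_USER_AGENT} EvilScraper [NC]
RewriteRule ^ - [F]
```

Because the exemption matches first and [L] ends processing for that request, the [F] rule further down is never reached for robots.txt.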
But wait! You may not even need to do this part. (The <Files> envelope is always necessary.) I don't know about other people, but all my access-control RewriteRules are constrained to requests for pages (anything ending in / or .html), so the server doesn't have to evaluate the rules for other requests like images. So if you don't have any ordinary pages ending in .txt, a request for robots.txt will sail on through anyway.
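As an illustration of what "constrained to pages" can look like, here is a sketch of one such rule. The condition is hypothetical; the point is only the pattern, which is my guess at one way to write "final / or .html" and not necessarily how anyone else does it.

```apache
# Hypothetical block that only looks at page requests:
# the pattern matches the bare root, anything ending in "/",
# or anything ending in ".html"
RewriteCond %{HTTP_USER_AGENT} EvilScraper [NC]
RewriteRule (^|/|\.html)$ - [F]
```

A request for robots.txt, or for an image or stylesheet, never matches the pattern, so the condition is not even evaluated.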
And wait a bit more, because you're not done yet.
On the <Files> side, you also need to allow everyone to see your 403 page, plus any stylesheets it needs. Otherwise a blocked visitor gets blocked again when the server tries to hand over the error page. If you're on shared hosting and you use their default filename for error documents, they've already taken care of this. Otherwise you'll need another <FilesMatch> envelope. Or put all your error documents in a separate directory and give that directory an htaccess of its own that says "Allow from all".
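A sketch of the <FilesMatch> version, assuming the 403 page is called forbidden.html and its stylesheet errorstyles.css. Both filenames are invented for the example; use your own.

```apache
ErrorDocument 403 /forbidden.html

# Let everyone fetch the error page and its stylesheet,
# including visitors who are denied everything else
<FilesMatch "^(forbidden\.html|errorstyles\.css)$">
Order Allow,Deny
Allow from all
</FilesMatch>
```

If you go the separate-directory route instead, the htaccess inside that directory only needs the "Allow from all" part, with no envelope around it.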
On the mod_rewrite side, you need another - [L] exception to cover any requests for the error pages. You don't need to muck about with a RewriteCond looking at %{REQUEST_URI}; put the filename into the pattern of the rule itself. You hardly ever need %{REQUEST_URI} unless you're making a negative match.
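Something along these lines, again assuming a root-level htaccess and the made-up filenames forbidden.html and errorstyles.css:

```apache
# Goes near the top, alongside the robots.txt exemption
RewriteRule ^(forbidden\.html|errorstyles\.css)$ - [L]

# By contrast, a negative match is where a condition earns its keep,
# e.g. "block this agent everywhere except the error page" (hypothetical):
# RewriteCond %{HTTP_USER_AGENT} EvilScraper [NC]
# RewriteCond %{REQUEST_URI} !^/forbidden\.html$
# RewriteRule ^ - [F]
```

The first form is usually the cheaper one, since the server tests the rule's pattern before it looks at any conditions.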