wilderness - 3:30 pm on May 13, 2012 (gmt 0)
Have you ever thought about using the HTTP_FROM header? I've never seen it spoofed, or sent by a non-crawler from a specified IP range. It also simplifies the solution in your case.
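# Flag any request whose User-Agent claims to be one of the major search bots
SetEnvIfNoCase User-Agent (google|msn|bing)bot dont_allow
# Clear the flag when the From header matches what the crawler sends
SetEnvIf From ^googlebot\(at\)googlebot\.com$ !dont_allow
SetEnvIf From ^bingbot\(at\)microsoft\.com$ !dont_allow
# Always let robots.txt through
SetEnvIf Request_URI ^/robots\.txt$ !dont_allow
# Return 403 Forbidden for anything still flagged
RewriteCond %{ENV:dont_allow} ^1$
RewriteRule .* - [F]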
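Note that the lines above only pair the UA with the From header; they don't look at the IP at all. If the IP also has to match, a rough sketch along the same lines might look like the following. The 192.0.2. prefix is only a placeholder (a documentation range), not a real crawler range, so substitute whatever ranges you already trust. It also assumes mod_rewrite is enabled and only covers the Google UA.

```apache
RewriteEngine On
# Only act on requests whose User-Agent claims to be Googlebot
RewriteCond %{HTTP_USER_AGENT} googlebot [NC]
# Always let robots.txt through
RewriteCond %{REQUEST_URI} !^/robots\.txt$
# Forbid when EITHER the IP is outside the placeholder range ...
RewriteCond %{REMOTE_ADDR} !^192\.0\.2\. [OR]
# ... OR the From header is not the one the crawler sends
RewriteCond %{HTTP:From} !^googlebot\(at\)googlebot\.com$
RewriteRule .* - [F]
```

Because [OR] binds only the last two conditions together, a request claiming the Google UA is refused unless both the IP prefix and the From header check out.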
Key_master,
Despite my longevity in this forum, my methods are quite simple.
If in the past decade somebody provided an example of headers that I comprehended, then I implemented it (if effective); if I didn't comprehend the use (at least enough that I could expand my simplistic capabilities), then I didn't use it.
The only active example that I have for header checks was for the AVG thing.
My apologies, however I don't understand how the lines you provided compare the IPs to the UAs.
1) Don't wish to allow access based upon IPs alone
2) or UAs alone.
(Note: just using the Google UA would also let in the numerous fakers that appear fairly often and in bunches.)
See this thread [webmasterworld.com] for a prime example.