SetEnvIf User-Agent ^MSIECrawler keep_out
order allow,deny
allow from all
deny from env=keep_out
But the following UA took an entire directory from my site:
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MSIECrawler)"
24 files - all code 200 - and all within 24 seconds. Here is one access-log line:
195.112.34.112 - - [Date] "GET /myfolder/etc.html HTTP/1.1" 200 37072 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; MSIECrawler)"
(195.112.34.112 belongs to Nildram Dynamic ADSL Accounts, UK)
Why did my .htaccess file fail?
Apparently I was wrong: I thought I had taken that line from one of the forum discussions, but now I can't find it anywhere. So I must have added it myself.
As I understand it, then, ^ means "begins with". So is there a formula that can be used to block a request based on a word or phrase that appears anywhere at all in a UA?
Thank you again.
Sure, no problem... Looking back at what I posted, I can't recommend adding the ")$" at the end
unless you test it thoroughly. The ")" might be interpreted as a special character, rather than a
literal, and that might break your deny again. Using Regular Expressions, you can "escape" the paren
by preceding it with a "\", but the SetEnvIf plus allow,deny method does not use regular expressions in exactly
the same way that mod_rewrite does unless you force it to, and I use mod_rewrite. So, just leave
the "^" off the front end, and it will match anywhere in the string.
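
For illustration, here is a minimal sketch of the rule from the first post with the anchor removed. The original pattern could never match, because the logged UA begins with "Mozilla/4.0" and "MSIECrawler" only appears near the end. This assumes mod_setenvif is available and that .htaccess overrides are allowed on the server; it reuses the keep_out variable name from the first post and has not been tested here.

```
# No leading "^", so the pattern matches "MSIECrawler" anywhere in the User-Agent
SetEnvIf User-Agent MSIECrawler keep_out
Order Allow,Deny
Allow from all
Deny from env=keep_out
```

If the capitalization of the UA might vary, mod_setenvif also provides SetEnvIfNoCase, which takes the same arguments and matches case-insensitively.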
BTW, a few months ago, I did a search on "big G" for "regular expressions regex" and found a fairly
good primer on a .edu domain for sorting out all those hats and dollars (^$).
Cheers!
Jim