Forum Moderators: open
This user agent is driving me nuts:
"Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"
Really at the end of my rope down here, and not technical enough to implement a bot detection & banning script.
There was a recent thread on this as well:
[webmasterworld.com...]
all you really need to do is include a file at the top and bottom of your page and it will block stuff based on its behaviour (the speed at which it downloads pages, the number of pages it accesses per minute, etc.), exactly like you want.
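For reference, behaviour-based blocking of this sort can also be done at the server level. A hypothetical sketch using the third-party mod_evasive module (it must be installed separately, and the thresholds below are illustrative placeholders, not recommendations):

```apache
<IfModule mod_evasive20.c>
    DOSHashTableSize    3097
    DOSPageCount        5    # same page more than 5 times...
    DOSPageInterval     1    # ...within 1 second
    DOSSiteCount        50   # more than 50 requests site-wide...
    DOSSiteInterval     1    # ...within 1 second
    DOSBlockingPeriod   60   # answer with 403s for 60 seconds
</IfModule>
```

Offenders are blocked temporarily based on request rate, which is the same idea as the include-file approach, just without touching the pages themselves.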
yeah, I have it flagged already. Do you have any idea how to install it in plain vanilla html pages?
RewriteCond %{HTTP_USER_AGENT} MSIE.+Windows [NC]
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/4\.[0-9]+\ \(compatible;\ MSIE\ [3-9]\.[0-9]{1,2}(;\ [^;]+)*;\ Windows\ (NT\ (4\.0|5\.(01?|1|2)|6\.0)|98;\ Win\ 9x\ 4\.90|98|95)(;\ [^;]+)*\)
# Following line allows screwed-up syntax "MSN 9.0;MSN 9.1" user-agent
RewriteCond %{HTTP_USER_AGENT} !^Mozilla/4\.[0-9]+\ \(compatible;\ MSIE\ [3-9]\.[0-9]{1,2}(;\ [^;]+)*;\ Windows\ (NT\ (4\.0|5\.(01?|1|2)|6\.0)|98;\ Win\ 9x\ 4\.90|98|95)(;\ [^;]+)*;\ +MSN\ 9\.0;MSN\ 9\.1(;\ [^;]+)*\)
RewriteRule .* - [F]
Jim
[edited by: jdMorgan at 6:45 pm (utc) on Nov. 3, 2007]
do you have any idea how to install it in plain vanilla html pages?
you'd have to set your server up to parse pages with an .html extension as php.
then you can just include a php code block at the top and bottom.
if you've got access to your .htaccess file then i think you can just add one simple line to it... but i don't use it myself, so i don't know what it is! maybe someone else will chime in with it.
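Putting the two previous suggestions together, the .htaccess addition would look something like this. The filenames and paths are placeholders, and php_value only works when PHP runs as an Apache module (under CGI/suPHP the equivalent settings go in php.ini instead):

```apache
# Parse plain .html/.htm pages as PHP, then automatically include a
# script before and after every page (paths are examples only)
AddType application/x-httpd-php .html .htm
php_value auto_prepend_file "/home/dir/public_html/header-check.php"
php_value auto_append_file  "/home/dir/public_html/footer-check.php"
```

With auto_prepend_file and auto_append_file you don't even need to edit the pages to add the include blocks by hand.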
This is what I currently have:
AddType application/x-httpd-php .html .htm .txt
php_value auto_prepend_file "/home/dir/public_html/botrdns.php"

SetEnvIfNoCase User-Agent "somebotUA" bad_bot
SetEnvIfNoCase User-Agent "someotherbotUA" bad_bot

<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>

<Files 403.shtml>
order allow,deny
allow from all
</Files>

deny from aaa.bbb.ccc.ddd
Apache Tutorial: .htaccess files
[httpd.apache.org...]
Hobbs, what you have now seems fine. In general, structure isn't awfully important, as few things are order-dependent, other than multiple directives within the same module.
Actually, some method and consistency (such as remark lines, which I personally don't use) are quite useful, especially in the event that you're required to pore over line after line of an .htaccess file, searching for a syntax error because an addition has created a 500 error taking down your entire website(s).
AddType application/x-httpd-php .html .htm .txt
php_value auto_prepend_file "/home/dir/public_html/botrdns.php"

SetEnvIfNoCase User-Agent "somebotUA" bad_bot
SetEnvIfNoCase User-Agent "someotherbotUA" bad_bot

<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>

<Files 403.shtml>
order allow,deny
allow from all
</Files>

deny from aaa.bbb.ccc.ddd
These files can be organized in a variety of fashions (similar in that manner to the HTML behind web pages).
A more organized arrangement would be the following:
AddType application/x-httpd-php .html .htm .txt
php_value auto_prepend_file "/home/dir/public_html/botrdns.php"
<Limit GET POST>
SetEnvIfNoCase User-Agent "somebotUA" bad_bot
SetEnvIfNoCase User-Agent "someotherbotUA" bad_bot
Order Allow,Deny
deny from aaa.bbb.ccc.ddd
Allow from all
Deny from env=bad_bot
</Limit>
Perhaps another member may provide a more effective positioning of the following lines?
As I don't use these files myself, it appears to be both duplication and conflict; however, I could be mistaken:
<Files 403.shtml>
order allow,deny
allow from all
</Files>
Don
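On the apparent duplication: as I understand it (this is an assumption, not something stated above), the <Files 403.shtml> block exists so that a visitor who has just been denied can still be served the custom error page — without the exemption, the deny rules would apply to the 403 document too. It pairs with an ErrorDocument directive, roughly like this (403.shtml is the default custom error page name on cPanel-style hosts):

```apache
# Sketch: exempt the custom 403 page from the deny rules so blocked
# visitors actually see it instead of a secondary error
ErrorDocument 403 /403.shtml
<Files 403.shtml>
order allow,deny
allow from all
</Files>
```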
RewriteCond %{HTTP_USER_AGENT} 98)$
RewriteRule .* - [F]
Checked logs several hours later and found I'd blocked a couple dozen legit users (USA, Brazil, Mexico...) Apparently my sites attract the antiquated.
SetEnvIfNoCase User-Agent "Windows\ 98\)$" bad_bot
SetEnvIfNoCase User-Agent "win98" bad_bot
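On their own, SetEnvIfNoCase lines like these only set a variable; something still has to act on it. A minimal sketch, using the same deny block shown earlier in the thread:

```apache
# Flag the UA, then refuse flagged requests
SetEnvIfNoCase User-Agent "Windows\ 98\)$" bad_bot
<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>
```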
jdMorgan,
Thanks for the code, I inserted it in a test site and it is working fine so far.
Can you blend SetEnvIfNoCase Remote_Host and User-Agent in one line?
Say, for example, I want to block only the "validexample" user agent from IP 1.2.3.4, which in this case is a proxy sending many visitors, but I only need to block the one that has a valid user agent.
[webmasterworld.com...]
RewriteCond %{HTTP_USER_AGENT} validUA
RewriteCond %{REMOTE_ADDR} ^123\.456\.789\.
RewriteRule .* - [F]
I am not sure if that would work in multiples, i.e. blocking multiple valid UAs from multiple proxies; that's why I was hoping for a line of SetEnvIfNoCase for each one, e.g.
SetEnvIfNoCase User-Agent "UA1" and Remote_Host "1.2.3.4" bad_bot
SetEnvIfNoCase User-Agent "UA2" and Remote_Host "5.6.7.7" bad_bot
RewriteCond %{HTTP_USER_AGENT} (validUA1|validUA2)
RewriteCond %{REMOTE_ADDR} ^123\.456\.789\. [OR]
RewriteCond %{REMOTE_ADDR} ^234\.567\.891\.
RewriteRule .* - [F]
You may also utilize UA begins-with, ends-with, or contains as options for your UA keyword; however, I would not suggest attempting to mix begins-with, ends-with, or contains in the same criteria line.
Your previous inquiry:
I am not sure if that would work in multiples, i.e. blocking multiple valid UA's from multiple proxies
In the event that you desire single entries (per IP and UA), then just use the example you copied from the aforementioned link.
edited: BTW, the benefit of the multiple IP ranges is that you have the capability of adding as many ranges as you desire.
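As an aside on the combined UA-plus-IP question: SetEnvIf can only test one request attribute per line, so there is no "and" form like the one asked about above. One workaround (a sketch, assuming mod_rewrite is available and using placeholder UA/IP values) is to let mod_rewrite set the same bad_bot variable when both conditions match, so the existing "Deny from env=bad_bot" block handles it:

```apache
# Both conditions must match for the E flag to fire
RewriteCond %{HTTP_USER_AGENT} validUA1
RewriteCond %{REMOTE_ADDR} ^1\.2\.3\.4$
RewriteRule .* - [E=bad_bot:1]
```

Repeat the three-line group once per UA/IP pair to get one entry per combination.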