Forum Moderators: phranque

"Mozilla/4.0 (compatible;)"

Troubles blocking this without blocking almost everyone

LunaC

6:10 pm on Feb 1, 2007 (gmt 0)

10+ Year Member



I'm guessing these are bots or scrapers of some kind. I never see any pages requested, just images, scripts and CSS; the IP ranges are all over the place, and a referrer is never shown.

I'm having difficulty blocking that user-agent without blocking valid UAs.

Here's an example from the logs:
###.###.###.### - - [31/Jan/2007:12:34:35 -0600] "GET /images/file.png HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible;)"

I've tried each of these, and none of them blocked it:

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ (compatible;)$ [NC]
RewriteRule .* - [F]

RewriteCond %{HTTP_USER_AGENT} "Mozilla\/4\.0\ (compatible;)" [NC]
RewriteRule .* - [F]

RewriteCond %{HTTP_USER_AGENT} "Mozilla/4\.0\ (compatible;)" [NC]
RewriteRule .* - [F]

SetEnvIfNoCase User-Agent (compatible;)$ banned

SetEnvIfNoCase User-Agent "Mozilla/4.0 (compatible;)" banned

And this one, which of course blocked almost everyone:

SetEnvIfNoCase User-Agent "(compatible;)" banned

I tried searching, but the final ) gets filtered out of every query, so finding the answer is proving difficult.

(Edited to disable smilies)

[edited by: LunaC at 6:10 pm (utc) on Feb. 1, 2007]

jdMorgan

6:12 pm on Feb 1, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The literal parentheses must be escaped for the pattern to match:

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible;\)$ [NC]
RewriteRule .* - [F]

Jim
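
A minimal annotated sketch of the difference, using the patterns from the posts above. It assumes the rule lives in a per-directory .htaccess file with mod_rewrite available; the comments reflect standard regular-expression behaviour rather than anything specific to this site.

```apache
# Unescaped: ( and ) open and close a regex group, so the pattern below
# expects the text "Mozilla/4.0 compatible;" with no parentheses at all.
# It therefore never matches the logged user-agent.
# RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ (compatible;)$ [NC]

# Escaped: \( and \) match literal parentheses, so the whole string
# "Mozilla/4.0 (compatible;)" is matched from start (^) to end ($).
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible;\)$ [NC]
RewriteRule .* - [F]
```

The ^ and $ anchors are what keep this from catching ordinary browsers whose user-agent merely begins with "Mozilla/4.0 (compatible;".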

LunaC

7:41 pm on Feb 1, 2007 (gmt 0)

10+ Year Member



Ah, thanks! I really should have seen that.

marodhum

1:23 pm on Mar 13, 2008 (gmt 0)

10+ Year Member



Hi,
I have this in my .htaccess:

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible\;\)$ [NC,OR]
RewriteRule .* - [F,L]

But my log shows:

217.33.165.nnn - - [12/Mar/2008:08:40:18 -0400] "GET /default.css HTTP/1.1" 200 5821 "-" "Mozilla/4.0 (compatible;)"

It seems that my rewrite rule is not working.
Thanks in advance for any comments or suggestions.

As

marodhum

1:28 pm on Mar 13, 2008 (gmt 0)

10+ Year Member



Oops, I guess the problem is here; I don't know why I couldn't find it earlier:
\(compatible\;\)$

I'll see later whether it works now.

As

marodhum

1:48 pm on Mar 13, 2008 (gmt 0)

10+ Year Member



Nope, it is still not working. :-(
Now the rule is:

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible;\)$ [NC,OR]
RewriteRule .* - [F,L]

Please, somebody help me.

wilderness

3:09 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> The literal parentheses must be escaped for the pattern to match:

Hey Jim,
I don't generally use rewrites for UAs; I use mod_setenvif instead.

Just for clarification, is there a difference between the "escape" requirements of these two modules?

I have mod_setenvif lines that have been functioning for an eternity without escaping the parentheses.

Thanks.

Don
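
On the mod_setenvif question: SetEnvIf and SetEnvIfNoCase also take a regular expression, so as far as pattern syntax goes the same escaping should apply. An unescaped (compatible;) is a group that matches the text compatible; anywhere in the header. Such lines can appear to work while matching far more than intended, as with the "(compatible;)" attempt in the first post. Below is a sketch of an anchored equivalent. It assumes the Apache 2.2-era Order/Allow/Deny access directives and reuses the banned variable name from the first post.

```apache
# Flag only the exact user-agent string; \( and \) are literal parentheses.
SetEnvIfNoCase User-Agent "^Mozilla/4\.0 \(compatible;\)$" banned

# Refuse requests that carry the flag (Apache 2.2-style access control).
Order Allow,Deny
Allow from all
Deny from env=banned
```

Inside the quoted pattern the space needs no backslash, unlike the unquoted RewriteCond form.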

Samizdata

3:10 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If you have an OR on the last condition in a list it will not work.

You don't need the L on the rule either.

Why not try the example posted above by the forum moderator?

wilderness

3:11 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



> Please, somebody help me.

What do your error logs say?
Are these the only lines in your .htaccess?
Have you turned on "RewriteEngine on" before these lines?

jdMorgan

3:41 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Samizdata has identified the problem -- The [OR] flag on the last RewriteCond must be removed, and the code should be just as I posted above.

Jim
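
To illustrate the [OR] point: consecutive RewriteCond lines are ANDed by default, so a list of alternative user-agents needs [OR] on every condition except the last one. In this sketch, ExampleBotOne and ExampleBotTwo are hypothetical placeholder names, not user-agents taken from this thread.

```apache
RewriteEngine on

# Each alternative carries [OR] so that any one match is enough...
RewriteCond %{HTTP_USER_AGENT} ^ExampleBotOne [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ExampleBotTwo [NC,OR]
# ...except the final condition, which closes the list without [OR].
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible;\)$ [NC]
RewriteRule .* - [F]
```

If an [OR] is missing from a condition in the middle of the list, the list splits into two groups that must both match, which a single user-agent string will normally never do.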

marodhum

5:56 pm on Mar 13, 2008 (gmt 0)

10+ Year Member



jdMorgan and Samizdata: no, it's not the last condition on my rewrite rule. I just provided a snippet from my .htaccess; that's why the OR is there.

Since it is the last rule, I have an L there.

To wilderness: yes, I have
Options +FollowSymLinks
RewriteEngine on
as the first lines.

I could not understand what you asked me to check in my error log. Since the request is getting a 200 response, will there be anything there?

And many thanks to everyone for your help.

As

jdMorgan

9:00 pm on Mar 13, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> snippet...

In that case, go through the list of blocked user-agent RewriteConds, and be sure that you have an [OR] on every one of them except the last.

[L] used with [F] is redundant, as mod_rewrite stops processing immediately if the conditions for a Forbidden response are met. See the mod_rewrite documentation of [F] flag processing.

Assuming you've made the regular-expression pattern changes suggested above, and have checked that all User-agent patterns are [OR]ed, then there is nothing wrong with the rule itself. Look for earlier rules, in this .htaccess file or in any .htaccess file above it, which could be bypassing this rule by stopping mod_rewrite processing or by internally rewriting the URL.

You could also try using only the code I provided above as a stand-alone rule, and move it up in your .htaccess file(s) a few rules at a time. If it starts working when moved toward the top of the file, then the problem is between that point and your original access-control rule.

Jim
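
A sketch of the kind of ordering problem described above, under stated assumptions: the first rule is a hypothetical pass-through for static files and was not part of any snippet posted in this thread. Because it ends the current rewrite pass with [L] for stylesheet and image requests, the user-agent block below it would never be evaluated for them.

```apache
Options +FollowSymLinks
RewriteEngine on

# Hypothetical earlier rule: leave static files untouched and stop here.
RewriteRule \.(css|png|gif|jpg)$ - [L]

# Not reached for /default.css, because the rule above already matched.
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4\.0\ \(compatible;\)$ [NC]
RewriteRule .* - [F]
```

Moving the user-agent block above the pass-through rule is the test suggested in the post: if the 403 then appears, the earlier rule was the culprit.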

marodhum

5:31 am on Mar 14, 2008 (gmt 0)

10+ Year Member



Just awesome reasoning, Jim. I am speechless.

> and have checked that all User-agent patterns are [OR]ed

The problem was right there. I added a new rewrite condition but forgot to [OR] it, which was causing the problem. Now it is working perfectly. :)

Once again, hats off to you, Jim, and three cheers for WebmasterWorld.

As