Home / Forums Index / Code, Content, and Presentation / Apache Web Server
Forum Library, Charter, Moderators: Ocean10000 & incrediBILL & phranque

Apache Web Server Forum

Unravel UA Blocking when it's not obvious? (htaccess UA Blocking)

 6:11 pm on May 23, 2014 (gmt 0)

How do I unravel which UA rule is blocking when it's not obvious?

The IPs are OK.
The web page itself loads OK, but the external files return 403.

Is wannabrowser.com DOA?


Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 1.1.4322; .NET4.0C; .NET4.0E)


Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53

Could it be one of these?

Here is more of the code I'm using:

# Block the bare UA "Mozilla/4.0 (compatible;)" -- anything ending in "compatible;)"
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} compatible;\)$
RewriteRule . - [F]

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{QUERY_STRING} (localhost|loopback|127\.0\.0\.1) [NC,OR]
RewriteCond %{QUERY_STRING} (\.|\*|;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00) [NC,OR]
RewriteCond %{QUERY_STRING} (md5|benchmark|union|select|insert|cast|set|declare|drop|update) [NC]
RewriteRule ^ - [F]
</IfModule>

I have a long list of BAD UA's...

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} (<|>|'|%0A|%0D|%27|%3C|%3E|%00) [NC,OR]

RewriteCond %{HTTP_USER_AGENT} (comodo\ spider|microsoft\ url\ control|seo\ robot|windows\ 3|windows\ 3\.1|windows\ 3\.11|windows\ 95|windows\ 98|win98|win\ 9x|windows\ 2000|win32) [NC,OR]

RewriteCond %{HTTP_USER_AGENT} (firefox/1[0-7]|=mozilla|22mozilla|monzilla) [NC]

RewriteRule ^ - [F]
</IfModule>

Thanks in advance 8)



 7:35 pm on May 23, 2014 (gmt 0)

Please tell us you do not have all these snippets in your htaccess file?


 7:35 pm on May 23, 2014 (gmt 0)

Is it your own site? If so, you can run a RewriteLog to see exactly what element triggered which rule. If it's shared hosting there's not a lot more you can find out. Error logs never say anything but "Client denied by server configuration" which, ahem, we already know ;)
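On Apache 2.2 that means something like the following, and it has to go in the server or virtualhost config, not in htaccess (the log path here is just an example); on 2.4 the RewriteLog directive is gone and you use LogLevel instead:

# Apache 2.2.x -- server/vhost config only, not .htaccess
RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 3

# Apache 2.4.x equivalent
LogLevel alert rewrite:trace3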

The web page loads OK, but external files are 403.

Now that's weird. I often see the opposite, because most of my rules are constrained to requests in html. The likeliest culprit here is not UA but a referer-based lockout: a request for a page could come in with anything as referer, but a supporting file should name either your own page or nothing.
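A typical supporting-file lockout of that sort might look something like this -- a sketch only, with example.com standing in for your own domain:

# 403 any request for a supporting file whose referer is
# neither blank nor our own site
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !example\.com [NC]
RewriteRule \.(gif|jpe?g|png|css|js)$ - [F]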


 8:28 pm on May 23, 2014 (gmt 0)

Yep, it's my site on a shared host ;/

Please tell us you do not have all these snippets in your htaccess file? - MOST of 'em!

I also use a long list of REFERER Blockers

# Block REFERER visits
RewriteEngine on
RewriteCond %{HTTP_REFERER} \.af [NC,OR]
RewriteCond %{HTTP_REFERER} \.cn [NC,OR]
RewriteCond %{HTTP_REFERER} \.de [NC,OR]
RewriteCond %{HTTP_REFERER} \.ru [NC,OR]
RewriteCond %{HTTP_REFERER} \.su [NC,OR]
RewriteCond %{HTTP_REFERER} ahrefs\.com [NC,OR]
RewriteCond %{HTTP_REFERER} compute\.amazonaws\.com [NC,OR]
RewriteCond %{HTTP_REFERER} compute-1\.amazonaws\.com [NC,OR]
RewriteCond %{HTTP_REFERER} archive\.org [NC,OR]
RewriteCond %{HTTP_REFERER} hetzner\.de [NC,OR]
RewriteCond %{HTTP_REFERER} keyweb [NC,OR]
RewriteCond %{HTTP_REFERER} pills [NC,OR]
RewriteCond %{HTTP_REFERER} plusserver\.de [NC,OR]
RewriteCond %{HTTP_REFERER} proxy [NC,OR]
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} seznam\.cz [NC,OR]
RewriteCond %{HTTP_REFERER} xxx [NC,OR]
RewriteCond %{HTTP_REFERER} zequn\.com
# RewriteRule .* - [F]
# Redirect [perishablepress.com...]
RewriteRule .* - [F]


 9:38 pm on May 23, 2014 (gmt 0)

RewriteCond %{HTTP_REFERER} \.de [NC,OR]
RewriteCond %{HTTP_REFERER} hetzner\.de [NC,OR]
RewriteCond %{HTTP_REFERER} plusserver\.de [NC,OR]

Huh what? This looks very much like a list that's been cobbled together from various sources with no subsequent attention.

What have you got against seznam? afaik it's a perfectly legitimate search engine.

I've never seen most of those nasties-- amazon AWS, hetzner etc-- in the "referer" slot. Do you really get enough that it's worth putting your server to the extra work instead of just blocking by IP?
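By way of comparison, an IP block in 2.2 syntax might look like this -- the ranges below are made up for illustration only:

# Apache 2.2 syntax; on 2.4 use <RequireAll> with "Require not ip"
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
Deny from 198.51.100.0/24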

semalt is an annoying referer, because they tend to come in through infected browsers (or, for all I know, voluntarily participating "distributed robotics" users). So they not only ask for the page, but also for its supporting files. Generally with the wrong form of your domain name, because 403 trumps 301.

RewriteCond %{HTTP_REFERER} \.ru [NC,OR]

See above about "no subsequent attention". What have you got against www.rum.com, www.runningshoes.com, www.rushing.com, www.rustynails.com and so on? (Moderators, please note that I'm just making up names off the top of my head.)


 11:53 pm on May 23, 2014 (gmt 0)

hetzner.de, plusserver.de and seznam.cz send lots of bots to my site.

hetzner.de and plusserver.de harbor lots of referer spammers posted from .ru- and .ua-based IPs.

compute.amazonaws.com and compute-1.amazonaws.com send several Java and other bots (the Java UA is blocked).

I not only block their domain/IP, I also block their nameservers ;}

RewriteCond %{HTTP_REFERER} \.ru [NC,OR] should only block domains ending in .ru, right?

If I'm doing it wrong; Which ones should I zap?


 12:56 am on May 24, 2014 (gmt 0)

Do it like this:

RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]
Mine's got a second condition-- case sensitive-- that exempts google and yandex.

Check your referer blocks periodically. (Even though error logs won't say explicitly, they're easy to identify because most robots don't include a referer at all.) Most of them will turn out to come from IPs that can be blocked in their own right, creating less work for your server.

There are probably even more words starting in "de" than in "ru". Admittedly "ua" will be rare. (uakari.net? Hmm.)
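Those two conditions together might be sketched like this -- a guess at the general shape, not anyone's actual rules:

# Block referers whose hostname ends in .ru or .ua ...
RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]
# ... unless the referer names a known search engine
# (deliberately case-sensitive, so no [NC])
RewriteCond %{HTTP_REFERER} !(google|yandex)\.
RewriteRule . - [F]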


 2:29 am on May 24, 2014 (gmt 0)

RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]
I have used this in the past, but it became a little too heavy-handed. se|my|ru|ua would end up blocking search.domain.com & myvzw.com. I want to target only the domains ending in .cn, .ru, .ua & more on my naughty list.


 3:34 am on May 24, 2014 (gmt 0)

:: sigh ::

Someone else explain it, please. What we have here is a failure to communicate.


 7:48 pm on May 24, 2014 (gmt 0)

If you want to block stuff that "ends in" then your RegEx pattern must have an "anchor" to make it so.

So you will need this:

RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]

It will match anything that contains .ru or .ua with either
- absolutely nothing after it, or
- a slash and some unspecified stuff after that.

In all cases, that equates to "hostname ends with .ua or .ru".
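To make that concrete -- example referers invented for illustration:

# Unanchored \.ru matches ".ru" anywhere in the referer:
#   http://www.rustynails.com/  -> blocked (".ru" in ".rustynails")
#   http://example.ru/page      -> blocked
# Anchored \.(ru|ua)(/|$) matches only at the end of the hostname:
#   http://www.rustynails.com/  -> not blocked
#   http://example.ru/page      -> blocked
#   http://example.ru           -> blocked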


 12:07 am on May 25, 2014 (gmt 0)

Something like this?
RewriteCond %{HTTP_REFERER} \.(cn|cz|nl|ru|su|ua)($|/) [NC,OR]


 12:37 am on May 25, 2014 (gmt 0)

Something like this?

You mean, something like exactly what I said five posts ago? Yup.


 9:59 pm on May 25, 2014 (gmt 0)

Nope ;}
lucy24's (/|$) vs g1smd's ($|/) -- I don't know if it matters or not?


 11:19 pm on May 25, 2014 (gmt 0)

No, it doesn't matter. They are equivalent.
I find it less confusing to write it the way I have.


 11:26 pm on May 25, 2014 (gmt 0)

A pipe separates two equivalent options. Within a pipe-separated group it makes absolutely no difference whether something is on the left or on the right. It doesn't matter whether you are separating two items, as in (ru|ua), or many, as in (cn|cz|nl|ru|su|ua).

Putting the / before (i.e. to the left of) the $ might conceivably shave a nanosecond off processing time-- but only if you postulate that text in a Regular Expression is read faster by the server when it knows it won't have to capture anything more until it reaches the closing parenthesis (because it has already found a match). Either way, the server has to continue reading, because it doesn't know whether there will be more stuff after that closing parenthesis.

If you've got only two items-- or multiple unrelated items, like a string of RewriteConds looking at various aspects of the request-- you might choose to put them in order of likelihood-to-occur. When you're listing many of the same thing, like your list of six possible tld's, keeping them in alphabetical order makes most sense.

You can see that this is getting into serious hair-splitting territory.

I find it less confusing to to write it the way I have.

And for me it's more intuitive to put them in conceptual order, so (/|$).

with hasty edit to get rid of unwanted smileys


 11:49 pm on May 25, 2014 (gmt 0)

Thanks again everyone for the help 8)

FYI - I did clean up \.cn & baidu\.cn


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved