
Apache Web Server Forum

    
Unravel UA Blocking when it's not obvious?
htaccess UA Blocking
EastTexas



 
Msg#: 4673999 posted 6:11 pm on May 23, 2014 (gmt 0)

How do I work out which UA rule is doing the blocking when it's not obvious?

The IPs are OK.
The web page loads OK, but external files are 403.

wannabrowser.com DOA?



c-50-130-36-106.hsd1.ms.comcast.net

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 1.1.4322; .NET4.0C; .NET4.0E)


host721680011798.direcway.com

Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53


It could be one of these?

I have more code from:
[perishablepress.com...]

# Mozilla/4.0 (compatible;)
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} compatible;\)$
RewriteRule . . [F,L]


<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{QUERY_STRING} ^.*(localhost|loopback|127\.0\.0\.1).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(\.|\*|;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(md5|benchmark|union|select|insert|cast|set|declare|drop|update).* [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>


I have a long list of bad UAs...

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]


RewriteCond %{HTTP_USER_AGENT} ^.*(comodo\ spider|microsoft\ url\ control|seo\ robot|windows\ 3|windows\ 3.1|windows\ 3.11|windows\ 95|windows\ 98|win98|win\ 9x|windows\ 2000|win32).* [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^.*(firefox/1[0-7]|=mozilla|22mozilla|monzilla).* [NC,OR]

RewriteRule ^(.*)$ - [F,L]
</IfModule>

Thanks in advance 8)

 

not2easy




 
Msg#: 4673999 posted 7:35 pm on May 23, 2014 (gmt 0)

Please tell us you do not have all these snippets in your htaccess file?

lucy24




 
Msg#: 4673999 posted 7:35 pm on May 23, 2014 (gmt 0)

Is it your own site? If so, you can run a RewriteLog to see exactly what element triggered which rule. If it's shared hosting there's not a lot more you can find out. Error logs never say anything but "Client denied by server configuration" which, ahem, we already know ;)
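For reference, a minimal sketch of the logging mentioned above. RewriteLog and RewriteLogLevel exist only in Apache 2.2, and only in server or virtual-host context, which is why they are out of reach on shared hosting. Apache 2.4 dropped both in favor of a per-module LogLevel, which is likewise not allowed in .htaccess. The log path is an assumption.

```apache
# Apache 2.2: main server config or <VirtualHost> only, never .htaccess
# (the path below is hypothetical; use whatever your server layout provides)
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4: RewriteLog no longer exists; rewrite tracing goes to the error log.
# Use this line instead of the two directives above:
# LogLevel alert rewrite:trace3
```

Turn the logging off again once the culprit is found, since higher levels slow the server noticeably.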

The web page loads OK, but external files are 403.

Now that's weird. I often see the opposite, because most of my rules are constrained to requests in html. The likeliest culprit here is not UA but a referer-based lockout: a request for a page could come in with anything as referer, but a supporting file should name either your own page or nothing.
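A rough sketch of the kind of referer-based lockout described here, assuming the site is example.com (a placeholder, not the poster's domain). A rule of this shape lets the page itself load from anywhere but returns 403 for supporting files whenever the referer is neither empty nor the site's own.

```apache
RewriteEngine On
# Let through requests that carry no referer at all
RewriteCond %{HTTP_REFERER} !^$
# Let through requests whose referer is the site itself (example.com is a placeholder)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# Any other referer asking for a supporting file gets a 403
RewriteRule \.(css|js|gif|jpe?g|png)$ - [F]
```

If a rule like this, or a broad referer blocklist, is present, it is worth checking whether the site's own page URLs happen to match one of the blocked patterns.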

EastTexas



 
Msg#: 4673999 posted 8:28 pm on May 23, 2014 (gmt 0)

Yep, it's my site on a shared host ;/

Please tell us you do not have all these snippets in your htaccess file? - MOST of 'em!

I also use a long list of REFERER Blockers


# Block REFERER visits
RewriteEngine on
RewriteCond %{HTTP_REFERER} \.af [NC,OR]
RewriteCond %{HTTP_REFERER} \.cn [NC,OR]
RewriteCond %{HTTP_REFERER} \.de [NC,OR]
RewriteCond %{HTTP_REFERER} \.ru [NC,OR]
RewriteCond %{HTTP_REFERER} \.su [NC,OR]
RewriteCond %{HTTP_REFERER} ahrefs\.com [NC,OR]
RewriteCond %{HTTP_REFERER} compute.amazonaws\.com [NC,OR]
RewriteCond %{HTTP_REFERER} compute-1.amazonaws\.com [NC,OR]
RewriteCond %{HTTP_REFERER} archive\.org [NC,OR]
RewriteCond %{HTTP_REFERER} hetzner\.de [NC,OR]
RewriteCond %{HTTP_REFERER} keyweb [NC,OR]
RewriteCond %{HTTP_REFERER} pills [NC,OR]
RewriteCond %{HTTP_REFERER} plusserver\.de [NC,OR]
RewriteCond %{HTTP_REFERER} proxy [NC,OR]
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} seznam\.cz [NC,OR]
RewriteCond %{HTTP_REFERER} xxx [NC,OR]
RewriteCond %{HTTP_REFERER} zequn\.com
# RewriteRule .* - [F]
# Redirect [perishablepress.com...]
RewriteRule ^(.*)$ [%{REMOTE_ADDR}...] [F,L]

lucy24




 
Msg#: 4673999 posted 9:38 pm on May 23, 2014 (gmt 0)

RewriteCond %{HTTP_REFERER} \.de [NC,OR]
<snip>
RewriteCond %{HTTP_REFERER} hetzner\.de [NC,OR]
<snip>
RewriteCond %{HTTP_REFERER} plusserver\.de [NC,OR]

Huh what? This looks very much like a list that's been cobbled together from various sources with no subsequent attention.

What have you got against seznam? afaik it's a perfectly legitimate search engine.

I've never seen most of those nasties-- amazon AWS, hetzner etc-- in the "referer" slot. Do you really get enough that it's worth putting your server to the extra work instead of just blocking by IP?

semalt is an annoying referer, because they tend to come in through infected browsers (or, for all I know, voluntarily participating "distributed robotics" users). So they not only ask for the page, but also for its supporting files. Generally with the wrong form of your domain name, because 403 trumps 301.

RewriteCond %{HTTP_REFERER} \.ru [NC,OR]

See above about "no subsequent attention". What have you got against www.rum.com, www.runningshoes.com, www.rushing.com, www.rustynails.com and so on? (Moderators, please note that I'm just making up names off the top of my head.)

EastTexas



 
Msg#: 4673999 posted 11:53 pm on May 23, 2014 (gmt 0)

hetzner.de, plusserver.de and seznam.cz send lots of bots to my site.

hetzner.de and plusserver.de harbor lots of referer spammers posting from .ru- and .ua-based IPs.

compute.amazonaws.com and compute-1.amazonaws.com send several Java and other bots (the Java UA is blocked).

I not only block their domains/IPs, I also block their name servers too ;}

RewriteCond %{HTTP_REFERER} \.ru [NC,OR] should only block domains ending in .ru, right?

If I'm doing it wrong, which ones should I zap?

lucy24




 
Msg#: 4673999 posted 12:56 am on May 24, 2014 (gmt 0)

Do like this:

RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]
Mine's got a second condition-- case sensitive-- that exempts google and yandex.

Check your referer blocks periodically. (Even though error logs won't say explicitly, they're easy to identify because most robots don't include a referer at all.) Most of them will turn out to come from IPs that can be blocked in their own right, creating less work for your server.
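As a sketch of what blocking by IP "in its own right" can look like in .htaccess: the range 192.0.2.0/24 below is a documentation-only placeholder, not an address taken from this thread, and which syntax applies depends on the Apache version the host runs.

```apache
# Apache 2.2 syntax
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24

# Apache 2.4 equivalent (use instead of the three lines above):
# <RequireAll>
#     Require all granted
#     Require not ip 192.0.2.0/24
# </RequireAll>
```

An address check like this is cheaper than running a regular expression against the referer on every request.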

There are probably even more words starting in "de" than in "ru". Admittedly "ua" will be rare. (uakari.net? Hmm.)

EastTexas



 
Msg#: 4673999 posted 2:29 am on May 24, 2014 (gmt 0)

RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]
I have used this in the past; it became a little too heavy-handed. se|my|ru|ua would end up blocking search.domain.com & myvzw.com. I want to target only the domains ending in .cn, .ru, .ua & more on my naughty list.

lucy24




 
Msg#: 4673999 posted 3:34 am on May 24, 2014 (gmt 0)

:: sigh ::

Someone else explain it, please. What we have here is a failure to communicate.

g1smd




 
Msg#: 4673999 posted 7:48 pm on May 24, 2014 (gmt 0)

If you want to block stuff that "ends in" then your RegEx pattern must have an "anchor" to make it so.

So you will need this:
\.(ru|ua)($|/)

It will match anything that contains .ru or .ua with either
- absolutely nothing after it, or
- a slash and some unspecified stuff after that.

In all cases, that equates to "hostname ends with .ua or .ru".
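To make the difference concrete, here is a small sketch contrasting the unanchored and anchored forms. The last, stricter variant is an assumption added for illustration, not something posted in this thread: it pins the match to the hostname part of the referer, so that a path segment such as /page.ru/ cannot trigger the block and a port number after the hostname does not defeat it.

```apache
RewriteEngine On

# Unanchored: ".ru" may appear anywhere, so http://www.rum.com/ is blocked too
# RewriteCond %{HTTP_REFERER} \.ru [NC]

# Anchored: ".ru" or ".ua" must be followed by a slash or by the end of the string
RewriteCond %{HTTP_REFERER} \.(ru|ua)($|/) [NC]
RewriteRule ^ - [F]

# Stricter variant (hypothetical): match only within the hostname, optional port allowed
# RewriteCond %{HTTP_REFERER} ^https?://[^/]+\.(ru|ua)(:[0-9]+)?($|/) [NC]
```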

EastTexas



 
Msg#: 4673999 posted 12:07 am on May 25, 2014 (gmt 0)

Something like this?
RewriteCond %{HTTP_REFERER} \.(cn|cz|nl|ru|su|ua)($|/) [NC,OR]

lucy24




 
Msg#: 4673999 posted 12:37 am on May 25, 2014 (gmt 0)

Something like this?

You mean, something like exactly what I said five posts ago? Yup.

EastTexas



 
Msg#: 4673999 posted 9:59 pm on May 25, 2014 (gmt 0)

Nope ;}
lucy24 (/|$) vs g1smd ($|/) I don't know IF it matters or not?

g1smd




 
Msg#: 4673999 posted 11:19 pm on May 25, 2014 (gmt 0)

No. It doesn't matter. They are equivalent.
I find it less confusing to write it the way I have.

lucy24




 
Msg#: 4673999 posted 11:26 pm on May 25, 2014 (gmt 0)

A pipe separates two equivalent options. Within a pipe-separated group it makes absolutely no difference whether something is on the left or on the right. Doesn't matter whether you are separating two items as in
($|/)
or many as in
\.(cn|cz|nl|ru|su|ua)

Putting the / before (i.e. to the left of) the $ might conceivably shave a nanosecond off processing time-- but only if you postulate that text in a Regular Expression is read faster by the server when it knows it won't have to capture anything more until it reaches the closing parenthesis (because it has already found a match). Either way, the server has to continue reading, because it doesn't know whether there will be more stuff after that closing parenthesis.

If you've got only two items-- or multiple unrelated items, like a string of RewriteConds looking at various aspects of the request-- you might choose to put them in order of likelihood-to-occur. When you're listing many of the same thing, like your list of six possible TLDs, keeping them in alphabetical order makes most sense.

You can see that this is getting into serious hair-splitting territory.

I find it less confusing to write it the way I have.

And for me it's more intuitive to put them in conceptual order, so
(^|&)
but
(&|$)
;)

with hasty edit to get rid of unwanted smileys

EastTexas



 
Msg#: 4673999 posted 11:49 pm on May 25, 2014 (gmt 0)

Thanks again everyone for the help 8)

FYI - I did clean up \.cn & baidu\.cn


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved