
Forum Moderators: Ocean10000 & incrediBILL & phranque


Unravel UA Blocking when it's not obvious?

htaccess UA Blocking

     

EastTexas

6:11 pm on May 23, 2014 (gmt 0)



How do I work out which UA rule is doing the blocking when it's not obvious?

The IPs are OK.
The web page loads OK, but external files are 403.

wannabrowser.com DOA?



c-50-130-36-106.hsd1.ms.comcast.net

Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 1.1.4322; .NET4.0C; .NET4.0E)


host721680011798.direcway.com

Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_1 like Mac OS X) AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 Mobile/11D201 Safari/9537.53


It could be one of these?

I have more code from:
[perishablepress.com...]

# Mozilla/4.0 (compatible;)
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} compatible;\)$
RewriteRule . - [F,L]


<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{QUERY_STRING} ^.*(localhost|loopback|127\.0\.0\.1).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(\.|\*|;|<|>|'|"|\)|%0A|%0D|%22|%27|%3C|%3E|%00).* [NC,OR]
RewriteCond %{QUERY_STRING} ^.*(md5|benchmark|union|select|insert|cast|set|declare|drop|update).* [NC]
RewriteRule ^(.*)$ - [F,L]
</IfModule>


I have a long list of bad UAs...

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
RewriteCond %{HTTP_USER_AGENT} ^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).* [NC,OR]


RewriteCond %{HTTP_USER_AGENT} ^.*(comodo\ spider|microsoft\ url\ control|seo\ robot|windows\ 3|windows\ 3.1|windows\ 3.11|windows\ 95|windows\ 98|win98|win\ 9x|windows\ 2000|win32).* [NC,OR]

RewriteCond %{HTTP_USER_AGENT} ^.*(firefox/1[0-7]|=mozilla|22mozilla|monzilla).* [NC,OR]

RewriteRule ^(.*)$ - [F,L]
</IfModule>

Thanks in advance 8)

not2easy

7:35 pm on May 23, 2014 (gmt 0)

WebmasterWorld Administrator 5+ Year Member Top Contributors Of The Month



Please tell us you do not have all these snippets in your htaccess file?

lucy24

7:35 pm on May 23, 2014 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



Is it your own site? If so, you can run a RewriteLog to see exactly what element triggered which rule. If it's shared hosting there's not a lot more you can find out. Error logs never say anything but "Client denied by server configuration" which, ahem, we already know ;)
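Where RewriteLog isn't available, one rough workaround is to run the logged UA strings against the rule patterns offline. Below is a minimal Python sketch of that idea. It assumes Python's re module behaves like Apache's regex engine for patterns this simple, and it approximates [NC] with re.IGNORECASE. The rule labels and the function name matching_rules are made up for illustration; the patterns and the two UA strings are copied from the first post.

```python
import re

# (label, pattern, flags). Patterns are copied from the rules posted in this
# thread; the labels are invented here purely for reporting.
UA_RULES = [
    ("bare compatible", r"compatible;\)$", 0),
    ("empty UA", r"^$", 0),
    ("odd characters", r"^.*(<|>|'|%0A|%0D|%27|%3C|%3E|%00).*", re.IGNORECASE),
    ("old windows / tools",
     r"^.*(comodo\ spider|microsoft\ url\ control|seo\ robot|windows\ 3|"
     r"windows\ 3.1|windows\ 3.11|windows\ 95|windows\ 98|win98|win\ 9x|"
     r"windows\ 2000|win32).*", re.IGNORECASE),
    ("old firefox / fake mozilla",
     r"^.*(firefox/1[0-7]|=mozilla|22mozilla|monzilla).*", re.IGNORECASE),
]


def matching_rules(user_agent):
    """Return the labels of every rule whose pattern matches the UA string."""
    return [label for label, pattern, flags in UA_RULES
            if re.search(pattern, user_agent, flags)]


UA_MSIE8 = ("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; "
            "SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; InfoPath.2; "
            ".NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET CLR 1.1.4322; "
            ".NET4.0C; .NET4.0E)")
UA_IPHONE = ("Mozilla/5.0 (iPhone; CPU iPhone OS 7_1_1 like Mac OS X) "
             "AppleWebKit/537.51.2 (KHTML, like Gecko) Version/7.0 "
             "Mobile/11D201 Safari/9537.53")

for name, ua in (("MSIE 8", UA_MSIE8), ("iPhone", UA_IPHONE)):
    print(name, "->", matching_rules(ua) or "no UA rule matches")
```

Neither of the two posted user agents matches any of these five patterns, which fits the guess that the culprit is something other than a UA rule. The long list that was not posted is not covered by this check.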

The web page loads OK, but external files are 403.

Now that's weird. I often see the opposite, because most of my rules are constrained to requests in html. The likeliest culprit here is not UA but a referer-based lockout: a request for a page could come in with anything as referer, but a supporting file should name either your own page or nothing.

EastTexas

8:28 pm on May 23, 2014 (gmt 0)



Yep, it's my site on a shared host ;/

Please tell us you do not have all these snippets in your htaccess file?

Most of 'em!

I also use a long list of referer blockers:


# Block REFERER visits
RewriteEngine on
RewriteCond %{HTTP_REFERER} \.af [NC,OR]
RewriteCond %{HTTP_REFERER} \.cn [NC,OR]
RewriteCond %{HTTP_REFERER} \.de [NC,OR]
RewriteCond %{HTTP_REFERER} \.ru [NC,OR]
RewriteCond %{HTTP_REFERER} \.su [NC,OR]
RewriteCond %{HTTP_REFERER} ahrefs\.com [NC,OR]
RewriteCond %{HTTP_REFERER} compute.amazonaws\.com [NC,OR]
RewriteCond %{HTTP_REFERER} compute-1.amazonaws\.com [NC,OR]
RewriteCond %{HTTP_REFERER} archive\.org [NC,OR]
RewriteCond %{HTTP_REFERER} hetzner\.de [NC,OR]
RewriteCond %{HTTP_REFERER} keyweb [NC,OR]
RewriteCond %{HTTP_REFERER} pills [NC,OR]
RewriteCond %{HTTP_REFERER} plusserver\.de [NC,OR]
RewriteCond %{HTTP_REFERER} proxy [NC,OR]
RewriteCond %{HTTP_REFERER} semalt\.com [NC,OR]
RewriteCond %{HTTP_REFERER} seznam\.cz [NC,OR]
RewriteCond %{HTTP_REFERER} xxx [NC,OR]
RewriteCond %{HTTP_REFERER} zequn\.com
# RewriteRule .* - [F]
# Redirect [perishablepress.com...]
RewriteRule ^(.*)$ [%{REMOTE_ADDR}...] [F,L]

lucy24

9:38 pm on May 23, 2014 (gmt 0)




RewriteCond %{HTTP_REFERER} \.de [NC,OR]
<snip>
RewriteCond %{HTTP_REFERER} hetzner\.de [NC,OR]
<snip>
RewriteCond %{HTTP_REFERER} plusserver\.de [NC,OR]

Huh what? This looks very much like a list that's been cobbled together from various sources with no subsequent attention.

What have you got against seznam? afaik it's a perfectly legitimate search engine.

I've never seen most of those nasties-- amazon AWS, hetzner etc-- in the "referer" slot. Do you really get enough that it's worth putting your server to the extra work instead of just blocking by IP?

semalt is an annoying referer, because they tend to come in through infected browsers (or, for all I know, voluntarily participating "distributed robotics" users). So they not only ask for the page, but also for its supporting files. Generally with the wrong form of your domain name, because 403 trumps 301.

RewriteCond %{HTTP_REFERER} \.ru [NC,OR] 

See above about "no subsequent attention". What have you got against www.rum.com, www.runningshoes.com, www.rushing.com, www.rustynails.com and so on? (Moderators, please note that I'm just making up names off the top of my head.)
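To see why the unanchored pattern overreaches, here is a small Python sketch. It assumes Python's re matches this pattern the same way Apache does, with re.IGNORECASE standing in for [NC]. The function name is_blocked and the referer URLs are illustrative only, built from the made-up hostnames above.

```python
import re

# The unanchored pattern from the referer list: a dot followed by "ru",
# anywhere in the referer string.
UNANCHORED = re.compile(r"\.ru", re.IGNORECASE)


def is_blocked(referer):
    """True if the unanchored pattern is found anywhere in the referer."""
    return UNANCHORED.search(referer) is not None


for referer in ("http://example.ru/",
                "http://www.rum.com/",
                "http://www.runningshoes.com/page.html",
                "http://www.rustynails.com/"):
    print(referer, "blocked" if is_blocked(referer) else "allowed")
```

All four referers come out blocked, because ".ru" also occurs at the start of ".rum", ".runningshoes" and ".rustynails".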

EastTexas

11:53 pm on May 23, 2014 (gmt 0)



hetzner.de, plusserver.de and seznam.cz send lots of bots to my site.

hetzner.de and plusserver.de harbor lots of referer spammers coming from .ru- and .ua-based IPs.

compute.amazonaws.com and compute-1.amazonaws.com send several Java and other bots (the Java UA is blocked).

I not only block their domains/IPs, I also block their name servers ;}

RewriteCond %{HTTP_REFERER} \.ru [NC,OR]
That should only block domains ending in .ru, right?

If I'm doing it wrong, which ones should I zap?

lucy24

12:56 am on May 24, 2014 (gmt 0)




Do it like this:

RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]

Mine's got a second condition-- case sensitive-- that exempts google and yandex.
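The two-condition idea can be sketched in Python as follows. The exemption pattern is a guess made up for this example, not the actual condition; it is kept case-sensitive as described, while the TLD test ignores case like [NC]. The function name referer_is_blocked is also hypothetical.

```python
import re

# First condition: referer host ends in .ru or .ua (case-insensitive, like [NC]).
TLD_BLOCK = re.compile(r"\.(ru|ua)(/|$)", re.IGNORECASE)

# Second condition: a hypothetical exemption for google and yandex referers.
# Deliberately case-sensitive, so an upper-case fake does not slip through.
SEARCH_EXEMPT = re.compile(r"^https?://(www\.)?(google|yandex)\.")


def referer_is_blocked(referer):
    """Block when the TLD condition matches and the exemption does not."""
    return bool(TLD_BLOCK.search(referer)) and not SEARCH_EXEMPT.search(referer)


for referer in ("http://spam.example.ru/",
                "http://yandex.ru/yandsearch?text=x",
                "http://www.google.com.ua/url?q=x",
                "http://WWW.GOOGLE.RU/"):
    print(referer, "blocked" if referer_is_blocked(referer) else "allowed")
```

In mod_rewrite terms this corresponds to a second RewriteCond with a negated pattern and no [NC] flag, placed after the TLD condition.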

Check your referer blocks periodically. (Even though error logs won't say explicitly, they're easy to identify because most robots don't include a referer at all.) Most of them will turn out to come from IPs that can be blocked in their own right, creating less work for your server.

There are probably even more words starting in "de" than in "ru". Admittedly "ua" will be rare. (uakari.net? Hmm.)

EastTexas

2:29 am on May 24, 2014 (gmt 0)



RewriteCond %{HTTP_REFERER} \.(ru|ua)(/|$) [NC]
I have used this in the past, but it became a little too heavy-handed: se|my|ru|ua would end up blocking search.domain.com and myvzw.com. I want to target only the domains ending in .cn, .ru, .ua and others on my naughty list.

lucy24

3:34 am on May 24, 2014 (gmt 0)




:: sigh ::

Someone else explain it, please. What we have here is a failure to communicate.

g1smd

7:48 pm on May 24, 2014 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



If you want to block stuff that "ends in" then your RegEx pattern must have an "anchor" to make it so.

So you will need this:
\.(ru|ua)($|/)


It will match anything that contains .ru or .ua with either
- absolutely nothing after it, or
- a slash and some unspecified stuff after that.

In all cases, that equates to "hostname ends with .ua or .ru".
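A quick way to check the anchored pattern is to try it in Python, assuming Python's re treats it the same way Apache does and using re.IGNORECASE for [NC]. The LOOSE pattern below is only a guess at what an earlier, unanchored version might have looked like; the helper name referer_matches and the sample URLs are illustrative.

```python
import re


def referer_matches(pattern, referer):
    """True if the pattern is found anywhere in the referer, ignoring case."""
    return re.search(pattern, referer, re.IGNORECASE) is not None


ANCHORED = r"\.(ru|ua)($|/)"        # the pattern given above
EXTENDED = r"\.(se|my|ru|ua)($|/)"  # same shape with two more TLDs
LOOSE = r"se|my|ru|ua"              # assumed earlier version: no dot, no anchor

samples = ("http://example.ru",
           "http://example.ru/",
           "http://www.example.UA/page.html",
           "http://www.rum.com/",
           "http://search.domain.com/",
           "http://myvzw.com/")

for referer in samples:
    print(referer,
          "anchored:", referer_matches(ANCHORED, referer),
          "extended:", referer_matches(EXTENDED, referer),
          "loose:", referer_matches(LOOSE, referer))
```

With the leading dot and the trailing ($|/), search.domain.com and myvzw.com are left alone even when se and my are in the list. One caveat: a path segment such as /archive.ru/ further along the URL would also match, so "hostname ends with" is a close approximation and not an exact guarantee.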

EastTexas

12:07 am on May 25, 2014 (gmt 0)



Something like this?
RewriteCond %{HTTP_REFERER} \.(cn|cz|nl|ru|su|ua)($|/) [NC,OR]

lucy24

12:37 am on May 25, 2014 (gmt 0)




Something like this?

You mean, something like exactly what I said five posts ago? Yup.

EastTexas

9:59 pm on May 25, 2014 (gmt 0)



Nope ;}
lucy24 wrote (/|$) and g1smd wrote ($|/). I don't know whether it matters or not.

g1smd

11:19 pm on May 25, 2014 (gmt 0)




No. It doesn't matter. They are equivalent.
I find it less confusing to write it the way I have.

lucy24

11:26 pm on May 25, 2014 (gmt 0)




A pipe separates two equivalent options. Within a pipe-separated group it makes absolutely no difference whether something is on the left or on the right. Doesn't matter whether you are separating two items as in
($|/)
or many as in
\.(cn|cz|nl|ru|su|ua)

Putting the / before (i.e. to the left of) the $ might conceivably shave a nanosecond off processing time-- but only if you postulate that text in a Regular Expression is read faster by the server when it knows it won't have to capture anything more until it reaches the closing parenthesis (because it has already found a match). Either way, the server has to continue reading, because it doesn't know whether there will be more stuff after that closing parenthesis.
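The equivalence is easy to confirm with a short Python sketch, again assuming Python's re is a fair stand-in for Apache's regex engine here. The sample referers and the function name same_verdict are made up for the test.

```python
import re

DOLLAR_FIRST = re.compile(r"\.(ru|ua)($|/)", re.IGNORECASE)
SLASH_FIRST = re.compile(r"\.(ru|ua)(/|$)", re.IGNORECASE)


def same_verdict(referer):
    """True if both orderings agree on whether the referer matches."""
    return (DOLLAR_FIRST.search(referer) is not None) == \
           (SLASH_FIRST.search(referer) is not None)


samples = ("http://example.ru",
           "http://example.ru/",
           "http://example.ua/page.html",
           "http://www.rum.com/",
           "")

for referer in samples:
    print(repr(referer), "same verdict:", same_verdict(referer))
```

Order inside a group can change which text gets captured when the alternatives overlap, but it never changes the yes/no answer that a RewriteCond cares about.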

If you've got only two items-- or multiple unrelated items, like a string of RewriteConds looking at various aspects of the request-- you might choose to put them in order of likelihood-to-occur. When you're listing many of the same thing, like your list of six possible tld's, keeping them in alphabetical order makes most sense.

You can see that this is getting into serious hair-splitting territory.

I find it less confusing to write it the way I have.

And for me it's more intuitive to put them in conceptual order, so
(^|&)

but
(&|$)

;)

with hasty edit to get rid of unwanted smileys

EastTexas

11:49 pm on May 25, 2014 (gmt 0)



Thanks again everyone for the help 8)

FYI - I did clean up \.cn & baidu\.cn
 
