Forum Moderators: phranque

Message Too Old, No Replies

Blocking from IP

Looking for a more effective way to control access by host (IP)

         

w3bmastine

8:19 am on Jan 28, 2016 (gmt 0)

10+ Year Member



Hi all,

I decided to clean up my .htaccess and get into the topic more. The snippet below is from my current .htaccess file (Apache 2.2).

# block ip addresses
# https://httpd.apache.org/docs/2.2/howto/access.html
# https://httpd.apache.org/docs/2.2/mod/mod_authz_host.html
deny from 64.79.100.24 # webcrawler.link
deny from 64.79.100.28 # webcrawler.link
deny from 64.79.100.11 # webcrawler.link
deny from 64.79.100.16 # webcrawler.link
deny from 64.79.100.44 # webcrawler.link
deny from 64.79.100.50 # webcrawler.link
deny from 64.79.100.58 # webcrawler.link
deny from 64.79.100.63 # webcrawler.link
deny from 81.144.138.34 # wotbox.com
deny from 81.144.138.40 # wotbox.com
deny from 90.199.136.222 # semalt
deny from 177.207.16.134 # semalt
deny from 179.211.155.20 # semalt
deny from 186.236.187.225 # semalt
allow from all


I'm currently studying [httpd.apache.org...] and the linked resources; however, I need some extra help here. For example, if I want to block all requests from webcrawler.link, is it OK to just write a partial IP address? E.g.:

deny from 64.79.100  # webcrawler.link


I don't know how many IP addresses they have in 64.79.100, but I assume they range from 64.79.100.24 to 64.79.100.63.

I appreciate your help.
W3bmastine

not2easy

2:26 pm on Jan 28, 2016 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



You can block those IPs but that won't block all the pests you're trying to block. It might block real visitors at the same time. If you search this forum for "semalt" you can read a lot more about the best way to stop their referer spam. Most of us don't use IPs to block things that can show up from any IP.

Try something like:
RewriteCond %{HTTP_REFERER} (buttons|gratis|semalt|website) [NC,OR]
RewriteCond %{HTTP_USER_AGENT} (webcrawler|wotbox) [NC]
RewriteRule .* - [F]
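Since the [NC] flag makes the match case-insensitive and the pattern is unanchored, that condition fires on those words anywhere in the referer. A rough sketch of that substring behaviour, using Python's re as a stand-in for Apache's regex engine (the URLs are made-up examples):

```python
import re

# The alternation from the RewriteCond above; re.IGNORECASE plays the role of [NC].
pattern = re.compile(r"(buttons|gratis|semalt|website)", re.IGNORECASE)

referers = [
    "http://semalt.com/crawler",             # blocked ("semalt")
    "http://example.com/my-WEBSITE-review",  # blocked ("website", any case)
    "http://example.com/widgets",            # allowed
]
for ref in referers:
    print(ref, "->", "403" if pattern.search(ref) else "ok")
```

Note how broad an unanchored word like "website" is; that point comes up again further down the thread.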

For IP blocking, learn to identify and use CIDR blocks in a deny list. There is a lot of information on "what" and "how" in the Search Engine Spiders forum: [webmasterworld.com...] but for your benefit it is best to review your access logs to learn what you need to deal with.
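If CIDR notation is unfamiliar, Python's stdlib ipaddress module works as a quick calculator (purely an illustration here, not anything Apache reads) for seeing what a given block covers:

```python
import ipaddress

# A /24 fixes the first three octets and covers the whole last octet, 0-255:
block = ipaddress.ip_network("64.79.100.0/24")
print(block.num_addresses)                             # 256
print(ipaddress.ip_address("64.79.100.44") in block)   # True
print(ipaddress.ip_address("64.79.101.44") in block)   # False
```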

lucy24

10:39 pm on Jan 28, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



deny from 64.79.100 # webcrawler.link

You can't do this.
The horse's mouth [httpd.apache.org] says (emphasis theirs, same text in 2.2 and 2.4):*
Comments may not be included on the same line as a configuration directive.


Now then:
deny from 64.79.100

Yes, that's perfectly fine, so long as it's a valid CIDR range. "64.79.100" is exactly the same as saying "64.79.100.0/24" at a savings of
:: counting on fingers ::
five bytes. This adds up quickly.
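As a sanity check (Python's ipaddress used purely as a scratchpad), the /24 that the shorthand expands to does contain every webcrawler.link address from the deny list above:

```python
import ipaddress

# "deny from 64.79.100" is shorthand for this network:
net = ipaddress.ip_network("64.79.100.0/24")

# The eight webcrawler.link addresses from the original deny list:
ips = ["64.79.100.11", "64.79.100.16", "64.79.100.24", "64.79.100.28",
       "64.79.100.44", "64.79.100.50", "64.79.100.58", "64.79.100.63"]
print(all(ipaddress.ip_address(ip) in net for ip in ips))  # True
```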


* But only after wild panic from google, because apparently the search phrase "syntax of comments" triggers a prove-you're-not-an-evil-robot captcha. No, I don't know why I didn't find it on my own, since it's near the top of the page.

w3bmastine

11:40 am on Jan 29, 2016 (gmt 0)

10+ Year Member



@not2easy: I'm working on referers and user-agent names, too. I also read the Search Engine Spiders forum daily, and I've even contributed a little lately.

@lucy24:
Thanks for pointing out the comment syntax error. D'uh!

To be honest, I had to look up "CIDR". I read the Wikipedia article on it twice. I'm still confused.

I get that 64.79.100.0/24 = 64.79.100 (a block of 256 addresses). But I only want to block 40 of them (from 64.79.100.24 to 64.79.100.63). So, with either,

deny from 64.79.100.0/24
or
deny from 64.79.100

I am blocking 256 addresses instead of the 40 I actually want to block. Am I right? If so, isn't there a cleverer way to block only the addresses I want?
Thank you for your help so far!

EDIT:
I think I got it...

CIDR, 64.79.100.0 - 64.79.100.63:
64.79.100.0/26

CIDR, 64.79.100.24-64.79.100.63:
64.79.100.24/29
64.79.100.32/27
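That arithmetic checks out. Python's ipaddress.summarize_address_range (just verifying the math, nothing Apache-specific) produces exactly those two blocks for the inclusive range:

```python
import ipaddress

# Minimal CIDR cover of 64.79.100.24 through 64.79.100.63 inclusive:
blocks = list(ipaddress.summarize_address_range(
    ipaddress.ip_address("64.79.100.24"),
    ipaddress.ip_address("64.79.100.63")))
print(blocks)  # [IPv4Network('64.79.100.24/29'), IPv4Network('64.79.100.32/27')]
print(sum(b.num_addresses for b in blocks))  # 40 (8 + 32)
```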

LilyTousi

3:36 pm on Jan 29, 2016 (gmt 0)

10+ Year Member



I have a question about blocking bad referrers and agents.
On my server, I have multiple addon domains. Of course, they all have their own .htaccess.
If I block the bad referrer at the public_html/.htaccess level, will it work for all addon domains, or do I have to block it in every addon?
It would save me a lot of time when updating the list.
Thanks

lucy24

10:00 pm on Jan 29, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



the 40 I actually want to block

But that's not awfully likely, unless you're dealing with infected human machines with fixed IP addresses. I've got
64.79.96.0/20
as a blocked server farm. So that's bigger than your 64.79.100.
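For the containment claim, a quick check with Python's ipaddress (again just as a calculator): the /20 is a strict superset of the whole /24 discussed above:

```python
import ipaddress

farm = ipaddress.ip_network("64.79.96.0/20")    # 64.79.96.0 - 64.79.111.255
site = ipaddress.ip_network("64.79.100.0/24")   # the 64.79.100 block
print(site.subnet_of(farm))   # True
print(farm.num_addresses)     # 4096
```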

If I block the bad referrer at the public_html/.htaccess level, will it work for all addon domains?

It should, assuming the blocks use mod_authzthingy and mod_setenvif directives, which are inherited straight down the line. It's only with mod_rewrite that you need to worry about inheritance. My host has the "userspace" structure, where all your sites are parallel; most access control is in the higher-level directory shared by all sites.

LilyTousi

2:21 pm on Jan 31, 2016 (gmt 0)

10+ Year Member



Thanks, lucy.
I'm wondering if there is a way to test whether a referrer is actually blocked, other than waiting to see it (or rather, not see it) in the raw access log.
If I edit the .htaccess of one of my less popular addon domains and add my own internet provider, will I see a 403 error?
Thanks for your help.

lucy24

10:09 pm on Jan 31, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If I edit the .htaccess of one of my less popular addon domains and add my own internet provider, will I see a 403 error?

You should. If you don't have a fixed IP address, use one of the free online lookups to learn what your IP is right now, and write your rules for that exact address.

Some browsers may let you send a fake referer. Otherwise, you can test by making a phony page called something like eviltestnamehere.html and giving it a link to your real site. If your live htaccess has a rule for "eviltestnamehere" (without anchors), you should find yourself blocked when you try to follow the link.

waiting to see it (or rather, not see it) in the raw access log

Unless you've got a really horrible host, you will see it in logs, because all requests are logged, regardless of response. (A 403 may or may not show up in error logs, depending on selected logging levels. But that's separate.)

robzilla

11:04 pm on Jan 31, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



RewriteCond %{HTTP_REFERER} (buttons|gratis|semalt|website) [NC,OR]

The word "website" is not uncommon enough to justify blocking it outright, in my opinion. The same could be said of "gratis", especially if you have an international audience. If your website is about widgets and I use a non-HTTPS search engine to look for "widgets gratis" or "widgets website" and I click on a result from your site, I'll be blocked. A blogger writes a post about your site, titles it "I love this website!" (with corresponding URI) -- same thing & not worth blocking a bit of spam that you could just as well filter out of your analytics.

lucy24

12:21 am on Feb 1, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



If your website is about widgets and I use a non-HTTPS search engine to look for "widgets gratis" or "widgets website" and I click on a result from your site, I'll be blocked.

Some types of referer lockouts have to be constrained a little more tightly, for example

:: shuffling papers ::
^http://[\w.]+(--|xx)
which means "only lock them out if the offending text is part of the referring domain name". (It now occurs to me it should be https? but I guess referer spam never bothers about security. That's an actual line from my real htaccess, meaning it was inspired by some real-life offending visitor.) The next step would then be
^http://[^?]+(--|xx)
meaning "anywhere in the path" but still not in the query (which is where search terms, if visible, would go).

robzilla

9:24 am on Feb 1, 2016 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



The search query was just an example; I think "website" is way too generic to be blocked. Other examples include a website that includes the word "website" in its (domain) name, e.g. widgetswebsite.com, or even widgets.website if you want to go a little crazy these days, or one that has a page linking to yours from a path like "/top-10-websites-about-widgets.html". Just gotta be careful :-)