Forum Moderators: phranque

Blocking IP ranges

Looking for an easy-to-understand tutorial...

LunaC

6:25 pm on Oct 20, 2006 (gmt 0)

10+ Year Member



I have piles of IP ranges I need to block and have been looking everywhere for a basic, beginner tutorial that is written in plain English, isn't piled with tech terms, and doesn't assume the reader already has advanced knowledge of Apache servers.

So far, from what I understand, mod_access is easier to learn than doing this with mod_rewrite, so if that's the case I'd prefer that.

Does anyone know of a super-basic tutorial? Or maybe a tool where I can type in the ranges and it'll spit out the code? (Heh, I know, not likely... but I can dream of an easy way out of this mess. ;) )

tsalmark

3:54 am on Oct 21, 2006 (gmt 0)

10+ Year Member



If your IPs are fairly static, or you are good with scripting, .htaccess is relatively simple and easy to use.
See: www.javascriptkit.com/howto/htaccess5.shtml
Remember that .htaccess affects subdirectories unless overridden; if you want to affect the whole site, put it in your root directory.

[edited by: jdMorgan at 4:08 pm (utc) on Oct. 21, 2006]
[edit reason] De-linked [/edit]
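As a rough sketch of the kind of .htaccess file described above (mod_access syntax; the address ranges are documentation placeholders, not real ranges to block):

```apache
# Placed in the site root, this applies to the whole site
# (and to every subdirectory, unless overridden deeper down).
# With Order Allow,Deny, Deny directives take precedence over
# Allow, so the two ranges below are refused with a 403 while
# everyone else gets through.
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
Deny from 192.0.2.160/28
```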

jdMorgan

4:21 pm on Oct 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, the problem is that in order to make use of the advanced capabilities of Apache, you *need* to learn the tech terms, pick up some basic scripting skills -- not that I'm calling .htaccess a script, because it isn't, but that's the closest term to describe what's needed -- and gain some familiarity with regular-expression pattern matching.

That is the purpose of this forum: To help you learn how to do this for yourself. As such, specific questions are always welcome. In many cases, you'll be referred back to the Apache documentation, so that's a good place to start. Once you've got a few specific questions based on reviewing the docs, please do post them here.

In this case, the directives you're looking for are described in the mod_access documentation [httpd.apache.org]. We also had some discussion of a combined mod_setenvif/mod_access method in this recent thread [webmasterworld.com].

Jim

wilderness

4:49 pm on Oct 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



[webmasterworld.com...]
[webmasterworld.com...]

Here are some other explanations for you to explore
(the first is likely the simplest answer):

[webhelpinghand.com...]
[baremetal.com...]
[edginet.org...]
[dimi.uniud.it...]
[webhelpinghand.com...]

And if none of that provides enough depth, you may begin where I did, before joining WebmasterWorld:
[google.com...]

LunaC

4:06 pm on Oct 22, 2006 (gmt 0)

10+ Year Member



Thank you so much; with that info I think I'm finally getting an understanding of what I need to do. I was looking at a much harder way, but I think this will block ranges using CIDR notation (which I can find more easily than the more complex regex expressions I was trying to understand).

Here's a stripped down snippet of what I'm going to use.

SetEnvIfNoCase User-Agent "Missigua" bad_bot
<Files *>
Order Deny,Allow
Deny from env=bad_bot
Deny from ###.##.##.0/19
Deny from ##.##.##.160/28
Allow from all
</Files>

That basically says "everyone is allowed by default (Order Deny,Allow), except those IP ranges and anyone flagged as a bad_bot", right? Are there any obvious errors? (Wrong order, badly written, or is using CIDR not as effective as other ways?)

I do have another question: is using <Files *>, then allowing everything, a bad idea? Would that say that anyone has rights the server normally wouldn't allow (i.e., more than the usual GET, POST, HEAD, etc.)?

*edit*
Another question: I often see the same as above written without the <Files> container. What's the difference?

[edited by: LunaC at 4:10 pm (utc) on Oct. 22, 2006]
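For comparison, here is a sketch of the two forms the question above refers to. In a per-directory .htaccess file, mod_access directives already apply to the whole directory and everything below it, and <Files *> matches every file, so the two are effectively equivalent (placeholder range used here):

```apache
# With a container: the directives apply to every file
# matched by the * pattern.
<Files *>
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
</Files>

# Without: in .htaccess the directives already apply to the
# directory the file lives in (and below), so this behaves
# the same for ordinary requests.
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
```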

wilderness

6:50 pm on Oct 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I haven't a clue what the Files line is for; perhaps Jim or one of the others can provide insight.

Order Correction

<Files *>
SetEnvIfNoCase User-Agent "Missigua" bad_bot
Order Deny,Allow
Deny from ###.##.##.0/19
Deny from ##.##.##.160/28
Allow from all
Deny from env=bad_bot
</Files>

LunaC

1:07 pm on Oct 23, 2006 (gmt 0)

10+ Year Member



OK, this is odd...

I changed the order to what wilderness posted yesterday afternoon. Looking at today's logfiles, the IPs that should be blocked are still getting through, though bad_bots are getting 403s. (I'd had the code written as I first posted it, with the same result: blocked IPs getting through, bad_bots 403'd.)

Here's the details:
1) The IPs that should be blocked are getting stuck in a 301, as they did before I tried blocking.
(I'm guessing they are requesting / with the www and being sent to the non-www /. That's what first caught my attention: huge blocks of logfile entries claiming to be Googlebot, stuck looping the same request hundreds of times, which is kind of hard to miss. The IPs are not Googlebot; they're from hosting companies, which is common behavior for scrapers.) I forgot to mention this part before; it didn't seem important till now.

2) I have the banning code before any redirects in .htaccess (so shouldn't they be sent a 403 before even hitting a redirect?).

3) As I've said, bad_bots are getting 403s; the should-be-banned IPs are not. I've triple-checked, and the CIDR is exactly as found in DNSstuff; a reverse CIDR/netmask test confirms the IP range is right.

So I'm more than a bit lost, with no idea what I should be looking for now.

jdMorgan

1:34 pm on Oct 23, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You need to re-read the description of the Order directive in the Apache mod_access documentation [httpd.apache.org] *very* carefully.

You can also omit the <Files> container if you like.

Jim
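For readers following along, the subtlety Jim is hinting at can be shown with a minimal sketch (mod_access semantics as documented; the range is a placeholder). The Order directive sets which directive type takes precedence, not the line order in the file:

```apache
# Order Deny,Allow: Deny is evaluated first, then Allow, and a
# client matching any Allow directive is admitted even if it also
# matches a Deny. So "Allow from all" here overrides every Deny
# line, and nothing is actually blocked.
Order Deny,Allow
Deny from 192.0.2.0/24
Allow from all

# Order Allow,Deny: Allow is evaluated first, then Deny, and a
# client must match an Allow AND match no Deny. The range below
# is refused even though it also matches "Allow from all".
Order Allow,Deny
Allow from all
Deny from 192.0.2.0/24
```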

LunaC

1:54 pm on Oct 23, 2006 (gmt 0)

10+ Year Member



Aha, OK, I finally get it -- or at least I think I understand it now. :)

I tested using Order Allow,Deny (after reading the explanation a few times, it finally sunk in as to why that makes more sense) and banned my own IP... finally, a 403.

Thank you both so much for your help. I still haven't found a beginner tutorial explaining how to write regular expressions, but this way is working, so I'm OK for now. :)
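Putting the thread's conclusion together, the working version of the earlier snippet would look something like this (a sketch; the masked CIDR ranges are kept as in the original posts, and the <Files> container is omitted as Jim suggested):

```apache
# Flag a known bad robot by its User-Agent string.
SetEnvIfNoCase User-Agent "Missigua" bad_bot
# Order Allow,Deny: Deny takes precedence over Allow, so the
# Deny lines below are honored despite "Allow from all".
Order Allow,Deny
Allow from all
Deny from env=bad_bot
Deny from ###.##.##.0/19
Deny from ##.##.##.160/28
```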

gregbo

10:13 pm on Oct 31, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



There are a lot of books on regexps, such as O'Reilly's Mastering Regular Expressions.