IP blocked in htaccess getting 200 in logs

Why isn't this working to block this IP?


grandma genie

7:39 am on Oct 12, 2010 (gmt 0)

10+ Year Member



Hello,

I have the following range blocked in htaccess, like so:

order allow,deny
deny from 91.21
allow from all

And yet this visitor shows up with this IP and gets a 200. I don't understand how that happened. It seems to be working for all the other IPs. Why not this one?

91.21.74.nn - - [11/Oct/2010:16:05:46 -0400] "GET /index.html HTTP/1.1" 200 31284 "mywebsitestuff" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10 ( .NET CLR 3.5.30729)"

Jeannie

sublime1

2:59 pm on Oct 12, 2010 (gmt 0)

10+ Year Member



I think these directives should work, since you're telling Apache to process Allow directives first and then Deny directives. Here's the quote from the Apache doc:

Allow,Deny
First, all Allow directives are evaluated; at least one must match, or the request is rejected. Next, all Deny directives are evaluated. If any matches, the request is rejected. Last, any requests which do not match an Allow or a Deny directive are denied by default.


In your case both directives match, so the Deny should do the trick. But this always requires three or four re-readings on my part, so I probably have it backwards.
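To make that concrete, here's a minimal sketch of how Apache should evaluate your snippet under Allow,Deny, assuming nothing else in the file touches these directives (the comments walk through the doc's two steps):

Order Allow,Deny
# Step 1: every request matches at least one Allow, so nothing fails here
Allow from all
# Step 2: requests from 91.21.* also match a Deny, so they get rejected
Deny from 91.21

So, by the book, your code should indeed block that visitor.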

Here's the relevant section of the Apache doc: [httpd.apache.org...]

Tom

jdMorgan

3:14 pm on Oct 12, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The code appears to be correct. However, if there are additional "Order" directives in this .htaccess file, or if the Allow/Deny section is enclosed in a <Files> or <FilesMatch> container, or if mod_access isn't available, then the code won't work properly.
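For example, a hypothetical container like this, appearing anywhere in the file, would silently override the directory-level block for all .html requests. This is NOT suggested code, just an illustration of the kind of thing to look for:

<FilesMatch "\.html$">
Order Deny,Allow
Allow from all
</FilesMatch>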

You could always try

Deny from 91.21.0.0/16

as an equivalent test, just to get more info.

Note that if you plan to use custom error documents, it will be much easier to serve the custom 403 error document to unwelcome visitors if you use the Deny,Allow priority rather than Allow,Deny.

Basically, you deny all unwelcome visitors from all resources, but then explicitly allow them to fetch your custom 403 error document. Otherwise, you can get an "infinite" loop, because access to the 403 response document will also be denied, triggering another (and another and another) 403 error in response to the first one.

I also recommend allowing all requests for your robots.txt file, so that unwelcome but robots.txt-compliant bots can fetch it, even if they are embedded in a 'bad' IP address range. That way, they will 'go away' after reading your robots.txt file and seeing that they're unwelcome, and won't pester your server.
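A minimal sketch of that arrangement, assuming a hypothetical custom error page named /custom-403.html:

# Let everyone fetch robots.txt and the custom 403 page (hypothetical filename)
SetEnvIf Request_URI "^/(robots\.txt|custom-403\.html)$" AllowAll
ErrorDocument 403 /custom-403.html
Order Deny,Allow
Deny from 91.21
Allow from env=AllowAll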

Jim

grandma genie

4:38 pm on Oct 12, 2010 (gmt 0)

10+ Year Member



Does Apache get confused if you use two different methods? Like if some of the listed IPs are 91.21 and others are 91.21.0.0/16?

As for custom error documents, I do not use one for 403 errors. Blocked visitors get the standard Apache access-denied page. I do have a custom 404 error page.

As far as I know, I do not block access to the robots.txt file, but I am not sure how to allow denied IPs to fetch it.

Jeannie

grandma genie

5:19 pm on Oct 12, 2010 (gmt 0)

10+ Year Member



Jim,
On a similar note, this visitor was looking for some images that are not on my server; at least, I don't see them or the folder they are supposed to be in. Why would they be getting an HTTP 200 on those requests? Here are the log entries:

72.84.49.nnn - - [11/Oct/2010:20:04:44 -0400] "GET / HTTP/1.1" 403 5044 "h**p://mail.aol.com/32783-111/aol-1/en-us/Suite.aspx" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
72.84.49.nnn - - [11/Oct/2010:20:04:46 -0400] "GET /icons/apache_pb2.gif HTTP/1.1" 200 2414 "www.mywebsite.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
72.84.49.nnn - - [11/Oct/2010:20:04:46 -0400] "GET /icons/powered_by_rh.png HTTP/1.1" 200 1213 "www.mywebsite.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"

I asked my web host and he said it's scanning to determine what type of server is being used, and that it's not a bad thing that the folder is missing.

Is that right? I thought an HTTP 200 meant they found what they were looking for.

Jeannie

logic_of_jd

7:50 pm on Oct 12, 2010 (gmt 0)

10+ Year Member



I had the same problem years ago and it turned out that the order is significant.

It seems that in this:

Order allow,deny
deny from $somewhere
allow from all

the deny line is either ignored or overridden by the following allow line, without complaint. Apparently this obscure behaviour is by design, whether intentional or not. For some reason it is still not well documented.

IIRC, the order needs to be swapped from allow,deny to deny,allow. There may be more to it, but that's the best I can remember at the moment.

As for the scanning: there is a bot that searches for specific files to identify the server software and OS running on a website. Sorry, I don't recall the specifics, but googling the IP address will show what you want to know.

sublime1

3:37 pm on Oct 13, 2010 (gmt 0)

10+ Year Member



I concur with logic_of_jd's response, with this caveat: if it doesn't work, make sure to comment out the section, as through some bizarre twist of logic you could end up allowing only the IP address you meant to ban and denying everyone else.

Tom

grandma genie

5:04 pm on Oct 13, 2010 (gmt 0)

10+ Year Member



So, should the correct sequence be:

order deny,allow
deny from 91.21
deny from 72.84.49.nnn
allow from all

As for the 91.21.74.nn IP, I have taken it off the list. The visitor passed muster.

Jeannie

logic_of_jd

8:57 pm on Oct 13, 2010 (gmt 0)

10+ Year Member



Yes, that is how I remember the order should be.

It might be helpful to experiment with it to understand it better. I'm going to try it on a test server and see what I come up with.

jdMorgan

5:48 pm on Oct 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You should never use "Allow from all" with "Order Deny,Allow" as the Allow then overrides and defeats all of the Denys. Use "Allow from" with "Order Deny,Allow" only to make exceptions for a few IP addresses within a larger blocked range, or to allow all IP addresses to fetch your robots.txt and custom 403 error pages -- both strongly recommended exceptions.

# Flag requests that every visitor, welcome or not, may fetch.
# Note the leading slash: Request_URI begins with "/".
SetEnvIf Request_URI "^/(robots\.txt|custom-403-error-page\.html)$" AllowAll
#
Order Deny,Allow
Deny from 91.21
Deny from 72.84.49.nnn
Allow from env=AllowAll

Note that "Order" has nothing at all to do with the order in which Denys and Allows are listed in the file. Rather, it controls whether Allows override Denys or vice-versa, regardless of where the directives are placed: with "Order Deny,Allow", the Allows always override the Denys.
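For instance, in this sketch (using the 203.0.113.0/24 documentation range purely for illustration), the Allow still wins even though it is listed first:

Order Deny,Allow
# Listed first, but still overrides the Deny below:
Allow from 203.0.113.7
Deny from 203.0.113.0/24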

An example of using "small" Allows to override "large" Denys might be:

Order Deny,Allow
# Deny Comscore via Savvis
Deny from 64.210.192.0/18
# Allow eight Girafa thumbnailing 'bot IP addresses
Allow from 64.210.196.192/29

Here, we deny a large range of unwelcome IP addresses, but we punch a small hole in the wall because we want to allow thumbnails of our pages to be available for use in some search engine listings.

I mean this example only to illustrate the technical point; the site from which this code was taken enforced this policy at the time the code was written, but I'm not recommending that specific policy here.

Jim

grandma genie

7:26 pm on Oct 15, 2010 (gmt 0)

10+ Year Member



My htaccess file was originally set up by my web host. He put in the first lines, which were:

order allow,deny
deny from 66.666.666.nn
allow from all

I have been adding to it since then and there are quite a few IPs there now.

I also have the ErrorDocument directive for 404s after the "Allow from all".

The following sections are all RewriteCond/RewriteRule blocks:

The RewriteCond %{HTTP_USER_AGENT} section
The blocked hotlinked-images section is next
Then the redirect of requests for site names is last.

Everything seems to be working as written. The only time I noticed a problem was when I had added the 91.21.74.nn range. For some odd reason that one was getting an HTTP 200 when visiting. I don't know why, but I have removed it from the list.

I do not use a custom 403 error page. I do have a robots.txt file. Can you mix and match the SetEnvIf directives with RewriteCond? Or does everything have to be rewritten to follow the SetEnvIf directives?

Jeannie

jdMorgan

7:41 pm on Oct 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Directives in your .htaccess file are not processed in strict order of appearance in the file. Rather, your .htaccess file is scanned by each Apache module in turn, with each module interpreting and applying only those directives that it understands. Therefore, it is a mistake to view .htaccess code as a sequentially-executing "program." By the same token, it makes no difference how you interleave directives addressed to different modules, leaving you some freedom to "organize" the file as you see fit.

So only the order of directives addressed to the same modules makes any difference...

The SetEnvIf I posted is intended to set the "AllowAll" variable for subsequent use by the "Allow from env=AllowAll" directive. You could also use that AllowAll variable in a RewriteCond, although I can't think of a compelling use for it.

And in addition, you could put that SetEnvIf all the way at the end of the file, and it would still pass the AllowAll variable to both mod_access and mod_rewrite, because on a properly configured server, mod_setenvif runs before either of those two other modules.
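As a sketch of the mix-and-match (the bad-bot User-Agent pattern here is just a placeholder):

RewriteEngine On
# Skip this block for requests flagged AllowAll by the SetEnvIf above
RewriteCond %{ENV:AllowAll} !=1
RewriteCond %{HTTP_USER_AGENT} badbot [NC]
RewriteRule .* - [F]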

I still have no idea why your deny of that specific IP address did not work. It was not due to "magic" or error (your code looked correct). It was undoubtedly due to some other directive(s) in some other piece of code (either here in .htaccess or in a server config file) that we could not see.

Jim

grandma genie

8:29 pm on Oct 15, 2010 (gmt 0)

10+ Year Member



This is the way that IP first accessed my site:

91.21.74.nn - - [11/Oct/2010:16:05:34 -0400] "GET /dogs/dogs.html HTTP/1.1" 200 4254 "h**p://suche.t-online.de/fast-cgi/tsc?q=dogs&encQuery=dogs&lang=any&mandant=toi&device=html&dia=suche&context=img&tpc=internet&ptl=std&classification=portal&start=0&num=10&ocr=yes&type=all&sb=top&more=none" "Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10 ( .NET CLR 3.5.30729)"

The odd referrer is what caught my attention, but I think it was just some sort of search engine, and I don't think it could have allowed the visitor to bypass the htaccess directives. They were just looking for pictures of a certain breed of dog. I was going to block them, but found the 91.21 already in my htaccess file. I also have 91.211.16 listed. I can't see any reason for the visitor to get an HTTP 200 when I had them blocked. This is the first time I've seen something like this happen. If I figure it out, I'll post the reason in the forum.
Thanks for your help, Jim.
-- Jeannie

jdMorgan

11:56 pm on Oct 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Nothing can allow a request to "bypass" the directives. But another directive could override that access-control directive. There is no "magic" here: it's all deterministic, and the cause of the problem is another directive "somewhere," an error, or both. But that error (if there is one) is not in your "Deny from" line.

Also, be aware that some servers are set up to log the initial status, and not the final one. This visitor only "really" got a 200-OK if the byte count of 4254 matches the size of the /dogs/dogs.html page. If that byte count matches an error page, then that's more likely what he got.
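If you can see the server's log configuration, the distinction shows up in the LogFormat string: plain "%s" records the status of the original request, while "%>s" records the final status. The standard "combined" format uses the latter:

# "%>s" logs the final status; plain "%s" would log the original one
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined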

Jim