Using our root htaccess file, we block a large ISP (deny from 123.0.0.0/8) because it houses troublesome scrapers, bots, etc., but we've had some legitimate users complaining because they can't access our site.
So, assuming they use a static IP, can I do this... to allow just the friendly visitors on a dodgy ISP?
deny from 123.0.0.0/8
allow from 123.456.111.111
allow from 123.11.
<Files 403.html>
order allow,deny
allow from all
</Files>
I seem to recall Jim providing an example of using "detailed sub-ranges" in mod_access (which I don't use for such separations), rather than mod_rewrite.
(Your "456", although a common forum practice, is an invalid octet for this purpose. I've replaced it with 123 in the second ("Class B") octet.)
It's worth noting that I wouldn't extend such an effort to restore access for a solitary Class D range unless it involved a good customer or some other priority criterion.
RewriteEngine on
# Deny 123 Class A, while allowing Class B and Class D exceptions
# Allow Class B (second-octet) exceptions 11 and 123
RewriteCond %{REMOTE_ADDR} ^123\.([0-9]|10|1[2-9]|[2-9][0-9]|1[01][0-9]|12[012]|12[4-9]|1[3-9][0-9]|2[0-5][0-9])\. [OR]
RewriteCond %{REMOTE_ADDR} ^123\.123\.([0-9]|[1-9][0-9]|10[0-9]|110|11[2-9]|1[2-9][0-9]|2[0-5][0-9])\. [OR]
# Allow Class D (fourth-octet) exception 111
RewriteCond %{REMOTE_ADDR} ^123\.123\.111\.([0-9]|[1-9][0-9]|10[0-9]|110|11[2-9]|1[2-9][0-9]|2[0-5][0-9])$
RewriteRule .* - [F]
Please note: if these are the only Rewrite lines present, then the final condition must omit the [OR] flag (as shown above).
(A second note and disclaimer: I've been making silly numerical omissions of late in these conversions, so double-check the ranges before use.)
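For what it's worth, the same policy can be written more compactly with negated conditions instead of enumerated sub-ranges. A sketch only, using the same example ranges; test against your own logs before relying on it:
RewriteEngine on
# Deny everything in 123.0.0.0/8 ...
RewriteCond %{REMOTE_ADDR} ^123\.
# ... except the 123.11 Class B range ...
RewriteCond %{REMOTE_ADDR} !^123\.11\.
# ... and except the single host 123.123.111.111
RewriteCond %{REMOTE_ADDR} !^123\.123\.111\.111$
RewriteRule .* - [F]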
So, assuming they use a static IP, can I do this... to allow just the friendly visitors on a dodgy ISP?
deny from 123.0.0.0/8
allow from 123.456.111.111
allow from 123.11.
<Files 403.html>
order allow,deny
allow from all
</Files>
Point is, if you change Order, be sure to watch your logs for unintended effects, including unwanted visitors.
I've seen some explanation of the application of more effective methods; however, they require Deny,Allow, and as a result those methods are not an option for me.
I do use Deny,Allow in some lesser (two) subfolders.
Allow,Deny means that Denys can override Allows, while Deny,Allow means that Allows can override Denys.
(Remember that the listing order of the individual "Allow from" and "Deny from" directives in your file is meaningless: all of the Allows are always processed first when Allow,Deny is specified, and all of the Denys are processed first when Deny,Allow is specified, regardless of the order in which they appear in your file.)
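To make that concrete, a minimal sketch using the same example ranges as above:
# Order Allow,Deny: default is deny; Deny overrides Allow,
# so 123.0.0.0/8 stays blocked even though "Allow from all" matches:
Order Allow,Deny
Allow from all
Deny from 123.0.0.0/8
# Order Deny,Allow: default is allow; Allow overrides Deny,
# so 123.11.* gets in despite the /8 Deny:
Order Deny,Allow
Deny from 123.0.0.0/8
Allow from 123.11.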
The reason that you see Deny,Allow in use for "more effective methods" is that it allows, for example, for serving custom 403 error documents and robots.txt even to denied IP address ranges.
In the case of custom 403 error documents, either you can't use them, you must allow them to be served in all cases, or you must put up with a self-inflicted denial-of-service attack whenever an initial 403 triggers a cascade of 403's as your server tries and fails to deliver the custom 403 page. Since that page --like all others-- is denied to that requestor, each attempt to serve it generates yet another 403.
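To illustrate, here is a hypothetical configuration that produces exactly that cascade (the /403.html path is just an example):
ErrorDocument 403 /403.html
Order Allow,Deny
Allow from all
Deny from 123.0.0.0/8
# A denied visitor gets a 403; the server then tries to serve
# /403.html, but that request is denied by the same rule, so the
# custom page never reaches the visitor and each attempt logs
# yet another 403.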
In the case of robots.txt, many user-agents that receive anything other than a 200-OK with a robots.txt file containing a valid Disallow will assume that they have carte blanche to spider your site, and again this usually results in a lot of subsequent 403s as they come back again and again trying to retrieve pages.
For these reasons, Deny,Allow tends to be the preferred solution.
Jim
The reason that you see Deny,Allow in use for "more effective methods" is that it allows, for example, for serving custom 403 error documents and robots.txt even to denied IP address ranges.
Many thanks, Jim.
ALL the IPs I deny in mod_access (and there are more than a few, especially Class A's) are not offered robots.txt, as is my intention; I'll never change that practice.
Including an IP range in mod_access or mod_rewrite is a choice I make based on my own objectives. (I should probably include all of the colos in the mod_access denials, but I haven't.)
whenever an initial 403 triggers a cascade of 403's as your server tries and fails to deliver the custom 403 page. Since that page --like all others-- is denied to that requestor, each attempt to serve it generates yet another 403.
Jim,
As you're aware, I had a recent issue with this "loop"; however, the loop was limited to my mod_rewrite denials and NOT my mod_access denials.
There may be some rhyme or reason for this difference, but under my current htaccess and/or host, the difference is there.
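If I had to guess at the rhyme or reason: a <Files 403.html> exemption like the one above satisfies mod_access on the error-page subrequest, but it does nothing for mod_rewrite, which would need its own escape clause. A sketch, untested:
# Exempt the custom 403 page itself from the mod_rewrite denial
RewriteCond %{REQUEST_URI} !^/403\.html$
RewriteCond %{REMOTE_ADDR} ^123\.
RewriteRule .* - [F]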
Don
deny from 123.0.0.0/8
allow from 123.456.111.111
allow from 123.11.
<Files 403.html>
order allow,deny
allow from all
</Files>
Here's an example of the method that I prefer to use for maximum flexibility and compliance with the robots.txt standard:
SetEnvIf Request_URI "403\.html$" AllowAll
SetEnvIf Request_URI "robots\.txt$" AllowAll
#
Order Deny,Allow
#
Deny from 123.0.0.0/8
#
Allow from env=AllowAll
Allow from 123.145.111.111
Allow from 123.11.