Forum Moderators: phranque

htaccess

Bet I got it wrong

tangor

1:08 am on Oct 23, 2008 (gmt 0)

I've been fiddling with (and not really knowing what I'm doing) htaccess over the last year. It's not doing EXACTLY what I want and I finally (hammer on head) think I know why. Hoping for a confirmation from the gurus...

It's the files section

Here's what I have one line above the section:

order deny,allow
<Files ~ "\.htaccess$">
deny from all
</Files>

<Limit GET POST>

</Limit>

<Limit PUT DELETE>
deny from all
</Limit>

<Files *>
header append X-robots-tag "noarchive"
deny from (long list fully obfuscated)

deny from env=ban
</Files>

Am I correct that there can only be one Files statement, and that anything I want to do with files has to be contained in that section? Obviously I want to protect .htaccess, and I also want to deny about a dozen persistent bad actors. Is the above structure incorrect? If so, what should I do to get things on track?

Probably I should pull the Limit out of the middle. I'm also wondering whether the deny from env=ban needs to be inside the Files statement, or whether I need several (since my SetEnvIf statements are working with two in place).
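For reference, what I mean by the env=ban pieces is roughly this (the patterns here are made up, not my real list):

# Two SetEnvIf tests that both flag the same "ban" variable
SetEnvIf User-Agent "BadBot" ban
SetEnvIf Referer "spam-site\.example" ban

# A single Deny then catches anything flagged above
Deny from env=ban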

jdMorgan

1:54 am on Oct 23, 2008 (gmt 0)

"<Files *>" doesn't really do anything, except to help control priority in resolving "overlapping" or conflicting Allows and Denys in <Directory> <Location> and <Files> -- See How Directory, Location and Files sections work [httpd.apache.org] for more info.

I think you can simplify somewhat, and then add allowances for robots.txt and your custom 403 error page (if you have one); neither of those should ever be restricted. Otherwise you can run into two problems. First, a robot that should be Disallowed by robots.txt but is coming from a banned IP address range can't fetch robots.txt, so it never learns it is Disallowed from spidering and hammers your server repeatedly. Second, when the server tries to serve a 403 response using your custom 403 error page but the 403 page itself is denied, it hits a second 403 error, then another, and another, and so on. (I call these two cases "self-inflicted denial of service attacks.")

Try something like:


Order deny,allow
#
<Files ~ "\.htaccess$">
Deny from all
</Files>
#
<Limit PUT DELETE>
Deny from all
</Limit>
#
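# Ban specific addresses, plus anything flagged as "ban" by your SetEnvIf rules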
Deny from (long list fully obfuscated)
Deny from env=ban
#
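# But always allow robots.txt and the custom 403 page, even from banned clients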
<Files ~ "^(robots\.txt|your-custom-403-error-page-if-any\.html)$">
Allow from all
</Files>
#
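# Ask search engines not to keep an archived/cached copy of any page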
Header append X-robots-tag "noarchive"

Jim

tangor

3:53 am on Oct 24, 2008 (gmt 0)

Thanks for the reply. I can see a few points of logic there that will be helpful in the future. I've implemented it and will see how it works this week.

I don't offer a custom 403, so is the following correct?

<Files ~ "^(robots\.txt)$">
Allow from all
</Files>

jdMorgan

12:51 pm on Oct 24, 2008 (gmt 0)

You can remove the parentheses, since there are no longer multiple ORed terms sharing the end anchors.
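That is, just:

<Files ~ "^robots\.txt$">
Allow from all
</Files>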

Jim