
spider trap htaccess - SECURITY ISSUE

exposes .htaccess to the public, need help to fix

         

amznVibe

9:17 am on Mar 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You know the infamous spider trap that was invented around here?
I found a huge security bug with it tonight.
IT EXPOSES THE .htaccess file TO THE PUBLIC
(and probably some other files that should not be seen, I need to test more)

Any apache server that uses this code:

<Files *>
order deny,allow
deny from env=getout
allow from env=allowsome
</Files>
which is required by the trap, is exposed.

Apparently the <Files *> container overrides Apache's basic protection that keeps the file from being served.
Both Apache 1.3 and 2.0 are affected.

Try it on your server if you are using the code:
http://example.com/.htaccess
or just create a stub .htaccess with the above code and give it a shot.
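A minimal test stub might look like the following; the IP address and the robots.txt exception here are placeholders only, not part of the original trap code:

SetEnvIf Remote_Addr ^192\.168\.0\.1$ getout
SetEnvIf Request_URI "robots\.txt$" allowsome

<Files *>
order deny,allow
deny from env=getout
allow from env=allowsome
</Files>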

This means it could expose all your rewrites, get cached by Google, etc.

I would like to hear from the Apache experts ASAP: how do we work around this problem?
TIA!

The Contractor

11:52 am on Mar 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Could you not add the following to your .htaccess file?

<Files .htaccess>
deny from all
</Files>
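In the spider-trap .htaccess that would sit after the existing <Files *> block, roughly like this (a sketch only; a <Files> section later in the file should be merged last and take precedence for requests to .htaccess):

<Files *>
order deny,allow
deny from env=getout
allow from env=allowsome
</Files>

# re-deny the control file; this later section wins for requests to .htaccess
<Files .htaccess>
deny from all
</Files>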

amznVibe

12:03 pm on Mar 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, but that's a patch for a bad rule in the first place.
Apparently "Files *" overrides Apache's internal security, so there are bound to be other internal Apache files affected by this: group lists, password lists (.htpasswd), etc.

I could even block any file that starts with "."

<Files ~ ^\.>
deny from all
</Files>

but that's not the point - there has to be a better rule than "Files *"?
I bet jdMorgan can come up with something as soon as he gets around here...

amznVibe

12:31 pm on Mar 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I found an interesting httpd.conf discussion where they were talking about improving apache's default blocking to this:

[bugs.debian.org...]

<Files ~ "^\.|^_|RCS|CVS|,|~|#">
Order allow,deny
Deny from all
</Files>

Here's what the new patterns are intended to match:

.* is for .htaccess, .htpasswd, .svn, Emacs locks, etc.
_* is for _vti_* and other FrontPage files.
, is for RCS files and some backup files.
~ is for Emacs backup files.
# is for Emacs lock files and backup files.

jdMorgan

1:57 pm on Mar 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Part of the problem is that we talk in 'snippets' around here. People show 'examples' and not finished code, mainly because almost every server is different from every other server. Showing 'finished' code is practically impossible.

Assuming you've got a list of IP addresses to deny (by setting 'getout'), then protecting the .ht files should be as simple as adding one more line:


SetEnvIf Remote_Addr 192.168.0.1 getout
...
SetEnvIf Remote_Addr 192.168.0.23 getout
SetEnvIf Request_URI "\.(htaccess|htpasswd)$" getout
SetEnvIf Request_URI "robots\.txt$|403error\.html$" allowsome
<Files *>
Order Deny,Allow
Deny from env=getout
Allow from env=allowsome
</Files>

There's nothing particularly magical about this situation; it's simply that .htaccess can do a per-directory override of previous server configuration settings. If you've told it to allow access to <Files *> (all files) from all IP addresses/user-agents/etc. not in the deny list, then that's what it will do.
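A sketch of how the two layers interact (not anyone's actual config):

# httpd.conf (server-wide): the stock protection for .ht* files
<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>

# .htaccess (per-directory, merged later): <Files *> also matches .htaccess
# itself, so its Allow/Deny settings take precedence over the section above
# unless a "getout" line for .ht* requests is added as shown
<Files *>
Order Deny,Allow
Deny from env=getout
Allow from env=allowsome
</Files>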

Jim

amznVibe

2:21 pm on Mar 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Ah, good approach. It closes the hole in a cleaner way.
I think I will take the more complete block list, though.
I wonder why they did RCS|CVS.
Since that pattern isn't anchored, wouldn't it block any filename containing those strings?
I think they meant extensions.

Unrelated to the security issue, but since we are on the same snippet, I think I am going to add this to modernize the block against Firefox/Mozilla pre-fetching:
SetEnvIf X-moz prefetch getout
What do you think of that approach?
Or is giving the browser a deny error a bad idea?

jdMorgan

3:09 pm on Mar 21, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I don't know what the CVS / RCS stuff is for. Since I don't have any files of that name, I don't need those patterns.

X-moz prefetch: That should work fine. I've used a similar approach in both mod_access and mod_rewrite code, and the 403, 404, or 410 response doesn't hurt anything. A 501 or 503 might be more appropriate though.
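For reference, a rough sketch of both variants (placement and the exact response code are up to you):

# mod_setenvif / mod_access: fold prefetch requests into the existing trap
SetEnvIf X-moz prefetch getout

# mod_rewrite: refuse prefetch requests with a 403 instead
RewriteEngine On
RewriteCond %{HTTP:X-moz} =prefetch
RewriteRule .* - [F,L]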

Jim

Pfui

4:15 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I noticed the hole last week but wasn't sure if I'd just goofed up some code somewhere. I was stunned because we use the 'dot-file block' in httpd.conf and it's always worked. So thanks, amznVibe, for letting me know that it wasn't just my tpyos!

FWIW, here's what I put in .htaccess and it works for me in 1.3.x:

RewriteRule ^htaccess$ [127.0.0.1...] [R,L]
RewriteRule ^\.htaccess$ [127.0.0.1...] [R,L]
RewriteRule ^(.*)/\.htaccess(.*) [127.0.0.1...] [R,L]

RewriteRule ^htpasswd$ [127.0.0.1...] [R,L]
RewriteRule ^\.htpasswd$ [127.0.0.1...] [R,L]
RewriteRule ^(.*)/\.htpasswd(.*) [127.0.0.1...] [R,L]

Actually I'm not sure which lines work -- which line(s) of each triplet, I mean. (I was freaked out enough that .htaccess was visible that I wrote up all of my possible permutations and something stuck A-OK, both top-level and sub-dirs.) So please don't use those without knowing what you're doing.
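If the aim is simply to refuse those requests rather than redirect them, a single rule should cover both the top level and sub-directories (a sketch only; per-directory patterns have no leading slash):

RewriteEngine On
# refuse .htaccess and .htpasswd anywhere in the path with a 403
RewriteRule (^|/)\.ht(access|passwd)$ - [F,L]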

Key_Master

4:38 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member




SetEnvIfNoCase Request_URI \.ht(access|passwd)$ ban

The .htaccess file really has nothing to do with a spider trap script that utilizes the security features of Apache. The administrator is responsible for securing the files on their server.

Pfui

6:07 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Key_Master, pardon me if this non-SysAdmin is missing something, but my server's dot-files have been locked down for years via httpd.conf (below), and they only became visible on one site -- the one with a <Files *> directive in .htaccess.

What are your thoughts on dot-files suddenly being visible to at least two of us? Might something else have been inadvertently overridden?

-----
#
# The following lines prevent .htaccess files from being viewed by
# Web clients. Since .htaccess files often contain authorization
# information, access is disallowed for security reasons. Comment
# these lines out if you want Web visitors to see the contents of
# .htaccess files. If you change the AccessFileName directive above,
# be sure to make the corresponding changes here.
#
# Also, folks tend to use names such as .htpasswd for password
# files, so this will protect those as well.
#
<Files ~ "^\.ht">
Order deny,allow
Deny from all
</Files>

#

Key_Master

6:21 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It probably has been overridden, assuming your virtual server configuration overrides httpd.conf.

Your server contains a public directory that serves files to browsers that visit your site. You should block any access to files you don't want served to the public -- not just .htaccess or .htpasswd files, but any file you consider sensitive (i.e. stats pages, password files, logs, etc.).
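For instance (the file extensions below are examples only; adjust them to whatever is actually sensitive on your server):

# example only: deny direct requests for some commonly sensitive file types
<FilesMatch "\.(log|bak|inc|sql|conf)$">
Order allow,deny
Deny from all
</FilesMatch>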

amznVibe

6:26 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, it really was a security issue, because the httpd.conf setting was overridden by the <Files *> container.

Normally Apache will hide a file set to chmod 640, but I couldn't understand why .htaccess was directly visible until I traced it to that.

Easy to fix, but the point is it does need to be fixed, and there are probably dozens of webmasters out there who copied the spider trap code verbatim and did not realize the issue. I've been running some sites with that code for two years, so that's not a good feeling at all.

Key_Master

7:01 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



SetEnvIfNoCase Request_URI \.ht(access|passwd)$ ban

The code above has been mentioned numerous times over the years, including in the original trap.pl script I shared with the forum in 2002. You just haven't been paying attention. :) See Msg #15:

[webmasterworld.com...]

<Files *> isn't the problem. It doesn't override httpd.conf; your virtual host configuration in httpd.conf is the root cause of the problem. And like you said, it's easy to fix.

amznVibe

7:02 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



By the way, is Opera 9.0 now doing some kind of pre-fetching too?
I'm starting to notice a lot of user-agents with Opera 9.0 being banned, which concerns me.
Do we need to add code to account for that?

amznVibe

8:03 am on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Grrr, here's a script some Opera 8 and 9 users are installing which prefetches but doesn't check robots.txt, and it sends no special header flags.

[userjs.org...]

Apparently that's why they are getting banned.

Did you know that the Fasterfox extension for Firefox actually checks robots.txt?
Good move on their part.

Pfui

4:26 pm on Mar 22, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Excerpted from the Fasterfox FAQ [fasterfox.mozdev.org] --

I'm a webmaster, how can I prevent prefetching?

Because some websites may not have the resources available to support the enhanced prefetching feature, it may be easily blocked by webmasters.

Prior to generating any prefetching requests, Fasterfox checks for a file named "robots.txt" in your site's root directory (subdirectories are not checked). If this file contains the following 2 lines, no prefetching requests will be made to your domain:

User-agent: Fasterfox
Disallow: /

<<

I don't know if that check can be disabled but three cheers it exists at all!

By the way, I found that Fasterfox was heeding my robots.txt without the Fasterfox-specific UA. (Aside: I'm guessing it was Fasterfox because all the browsers suddenly asking for robots.txt, and nothing else, were Firefox.)

Now to figure out which Firefox extension/whatever makes HEAD requests for robots.txt, then pillages anyway...