Forum Moderators: phranque
Any apache server that uses this code, which is required by the trap, is exposed:
<Files *>
order deny,allow
deny from env=getout
allow from env=allowsome
</Files>
Apparently the <Files *> section overrides Apache's basic protection that would otherwise keep the file from being seen.
Both apache 1.3 and 2.0 are affected.
Try it on your server if you are using the code:
http://example.com/.htaccess
or just create a stub .htaccess with the above code and give it a shot.
This means it could expose all your rewrites, get cached by Google, etc.
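As best I can tell (this is just my reading of it, so correct me if I'm wrong), the Order line is the culprit: with Order deny,allow the default result is Allow, so the section seems to trump the stock ^\.ht protection, and any request that never matches a Deny line, including a request for /.htaccess from a visitor who hasn't tripped the trap, gets served:
<Files *>
# default result under Order deny,allow is Allow
order deny,allow
# only visitors carrying the 'getout' flag are refused
deny from env=getout
# re-allows a couple of named files for visitors who have been flagged
allow from env=allowsome
</Files>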
I would like to hear from the Apache experts ASAP: how do we work around this problem?
TIA!
I could even block any file that starts with "."
<Files ~ ^\.>
deny from all
</Files>
[bugs.debian.org...]
<Files ~ "^\.|^_|RCS|CVS|,|~|#">
Order allow,deny
Deny from all
</Files>
Here's what the new patterns are intended to match:
.* is for .htaccess, .htpasswd, .svn, Emacs locks, etc.
_* is for _vti_* and other FrontPage files.
, is for RCS files and some backup files.
~ is for Emacs backup files.
# is for Emacs lock files and backup files.
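For what it's worth, the same thing written with <FilesMatch>, the form the Apache docs recommend over <Files ~> these days, would look something like this (same pattern, so test it on your own setup):
<FilesMatch "^\.|^_|RCS|CVS|,|~|#">
Order allow,deny
Deny from all
</FilesMatch>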
Assuming you've got a list of IP addresses to deny (by setting 'getout'), then protecting the .ht files should be as simple as adding one more line:
SetEnvIf Remote_Addr 192.168.0.1 getout
...
SetEnvIf Remote_Addr 192.168.0.23 getout
SetEnvIf Request_URI "\.(htaccess|htpasswd)$" getout
SetEnvIf Request_URI "robots\.txt$|403error\.html$" allowsome
<Files *>
Order Deny,Allow
Deny from env=getout
Allow from env=allowsome
</Files>
Jim
Unrelated to the security issue, but since we're on the same snippet, I think I'm going to add this to modernize the block against Firefox/Mozilla pre-fetching:
SetEnvIf X-moz prefetch getout
whatcha think of that approach?
Or is giving the browser a deny error a bad idea?
X-moz prefetch: That should work fine. I've used a similar approach in both mod_access and mod_rewrite code, and the 403, 404, or 410 response doesn't hurt anything. A 501 or 503 might be more appropriate though.
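In mod_rewrite terms, something along these lines should do the same job; [F] answers with a 403, or you can swap in [G] if you'd rather send a 410:
RewriteEngine On
RewriteCond %{HTTP:X-moz} =prefetch
RewriteRule .* - [F,L]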
Jim
FWIW, here's what I put in .htaccess and it works for me in 1.3.x:
RewriteRule ^htaccess$ [127.0.0.1...] [R,L]
RewriteRule ^\.htaccess$ [127.0.0.1...] [R,L]
RewriteRule ^(.*)/\.htaccess(.*) [127.0.0.1...] [R,L]
RewriteRule ^htpasswd$ [127.0.0.1...] [R,L]
RewriteRule ^\.htpasswd$ [127.0.0.1...] [R,L]
RewriteRule ^(.*)/\.htpasswd(.*) [127.0.0.1...] [R,L]
Actually I'm not sure which lines work (which line(s) of each triplet, I mean). I was freaked out enough by .htaccess being visible that I wrote out every permutation I could think of, and something stuck, both top-level and in sub-dirs. So please don't use those without knowing what you're doing.
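For what it's worth, a single rule along these lines should catch the .ht files both at the top level and in sub-dirs, and it just answers with a 403 instead of redirecting; again, test it before trusting it:
RewriteRule (^|/)\.(htaccess|htpasswd) - [F,L]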
What are your thoughts on dot-files suddenly being visible for at least two of us? Might something else have been inadvertently overridden?
-----
#
# The following lines prevent .htaccess files from being viewed by
# Web clients. Since .htaccess files often contain authorization
# information, access is disallowed for security reasons. Comment
# these lines out if you want Web visitors to see the contents of
# .htaccess files. If you change the AccessFileName directive above,
# be sure to make the corresponding changes here.
#
# Also, folks tend to use names such as .htpasswd for password
# files, so this will protect those as well.
#
<Files ~ "^\.ht">
Order deny,allow
Deny from all
</Files>
#
Your server contains a public directory that serves files to browsers that visit your site. You should block any access to files you don't want served to the public: not just .htaccess or .htpasswd files, but any file you consider sensitive (i.e. stats pages, password files, logs, etc.).
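As a rough example (the extensions here are only placeholders, so substitute whatever sensitive files you actually keep under the web root), something like this keeps the obvious candidates from being served:
<FilesMatch "\.(log|bak|old|sql|inc)$">
Order allow,deny
Deny from all
</FilesMatch>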
Normally Apache will hide files set to chmod 640, but I couldn't understand why .htaccess was directly visible until I traced it to that block.
It's easy to fix, but the point is that it does need to be fixed, and there are probably dozens of webmasters out there who copied the spider trap code verbatim and never realized the issue. I've been running some sites with that code for two years, so that's not a good feeling at all.
The code above has been mentioned numerous times over the years, including in the original trap.pl script I shared with the forum in 2002. You just haven't been paying attention. :) See Msg #15:
[webmasterworld.com...]
<Files *> isn't the problem. It doesn't override httpd.conf; your virtual host configuration in httpd.conf is the root cause of the problem. And like you said, it's easy to fix.
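In other words, check that each virtual host actually carries (or inherits) the stock protection from the default conf. Something along these lines, with the host name and paths as placeholders and adjusted to however your distribution splits its config:
<VirtualHost *>
ServerName www.example.com
DocumentRoot /var/www/example
# same block as the default httpd.conf quoted above
<Files ~ "^\.ht">
Order deny,allow
Deny from all
</Files>
</VirtualHost>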
[userjs.org...]
Apparently that's why they are getting banned.
Did you know that the fasterfox extension for Firefox actually checks robots.txt?
Good move on their part.
I'm a webmaster, how can I prevent prefetching?
Because some websites may not have the resources available to support the enhanced prefetching feature, it may be easily blocked by webmasters.
Prior to generating any prefetching requests, Fasterfox checks for a file named "robots.txt" in your site's root directory (subdirectories are not checked). If this file contains the following 2 lines, no prefetching requests will be made to your domain:
User-agent: Fasterfox
Disallow: /
I don't know if that check can be disabled, but three cheers that it exists at all!
By the way, I found that Fasterfox was heeding my robots.txt even without the Fasterfox-specific UA lines. (Aside: I'm guessing it was Fasterfox because all the browsers that suddenly started asking for robots.txt, and nothing else, were Firefox.)
Now to figure out which Firefox extension/whatever makes HEAD requests for robots.txt, then pillages anyway...