
Apache Web Server Forum

    
Generate 404-Not Found response
rewrite rule to 404 all but allowed file types
MattyUK




msg:3138970
 5:38 pm on Oct 29, 2006 (gmt 0)

Hi

I wish to prohibit all BUT certain file types from the browsers using htaccess.

For example: I may want to allow only .html and .htm files but NOT .php. Requests for anything .php should fail with a 404 REGARDLESS of whether the file actually exists.

In this way a high level htaccess can prevent browser access to prohibited files even if they exist in the document root, whilst allowing access to known/safe file types.

It's part of a defense-in-depth measure.

Here is my attempt. But it causes a 500 error

#Allowed file extensions, if not in this group request suffers a 404
#RewriteCond %{SCRIPT_URL}!^$ [NC,OR]
#Above allows directory default page requests i.e. downloads/
#RewriteCond %{SCRIPT_URL}!\.php[5]?$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.php$ [NC,OR]
#RewriteCond %{SCRIPT_URL}!\.htm[l]?$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.html$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.htm$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.txt$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.css$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.pdf$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.zip$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.gz$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.jp[e]?g$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.gif$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.png$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.bmp$ [NC,OR]
RewriteCond %{SCRIPT_URL}!\.ico$ [NC]
#RewriteRule ^(.*)$ [L,R=404]
#RewriteRule ^(.*)$ - [L,R=404]
RewriteRule ^(.*)$ /error404.html [L,R=404]

Note: there is normally a space between } and ! (}! = }<1 white space>!).
Commented lines represent previous attempts and removals made while trying to get it to work.

So, a few things:
1) Can anybody help me make this rule work and/or explain where I'm going wrong? Am I best off using %{SCRIPT_URL}? Is there a better approach?
2) I tried finding the section in the docs but can't find the rule flags section. Can somebody show me where to look up what [F,L,R=code] mean, please?

Ideally, I want to stay with the RewriteCond/RewriteRule structure rather than a single ¦'ed RewriteRule. It is easier to read and modify later.

Thanks in advance. I couldn't find a similar post, so please excuse me if it's around somewhere.

Matt

 

MattyUK




msg:3139021
 6:44 pm on Oct 29, 2006 (gmt 0)

Just realised I'd need to remove the OR flags and don't need the capture brackets.

New version becomes:


#Allowed file extensions, if not in this group request suffers a 404
#RewriteCond %{SCRIPT_URL}!^$ [NC]
#Above allows directory default page requests i.e. downloads/
RewriteCond %{SCRIPT_URL}!\.php5?$ [NC]
RewriteCond %{SCRIPT_URL}!\.html?$ [NC]
RewriteCond %{SCRIPT_URL}!\.txt$ [NC]
RewriteCond %{SCRIPT_URL}!\.css$ [NC]
RewriteCond %{SCRIPT_URL}!\.pdf$ [NC]
RewriteCond %{SCRIPT_URL}!\.zip$ [NC]
RewriteCond %{SCRIPT_URL}!\.gz$ [NC]
RewriteCond %{SCRIPT_URL}!\.jpe?g?$ [NC]
RewriteCond %{SCRIPT_URL}!\.gif$ [NC]
RewriteCond %{SCRIPT_URL}!\.png$ [NC]
RewriteCond %{SCRIPT_URL}!\.bmp$ [NC]
RewriteCond %{SCRIPT_URL}!\.ico$ [NC]
#RewriteRule ^(.*)$ [L,R=404]
#RewriteRule ^(.*)$ - [L,R=404]
RewriteRule ^(.*)$ /error404.html [L,R=404]

However even with the above I still get the dreaded 500 internal server error.

A point to note: I'm trying to keep the $ anchor at the end of the script name, since REQUEST_URI returns GET variables in the URL and I don't want people bypassing the rule by appending a valid file type in the query string.

i.e.
blah.vbs?value=.css
blah.vbs
Should both get a 404 even if blah.vbs really exists.

Matt

[edited by: MattyUK at 6:54 pm (utc) on Oct. 29, 2006]

jdMorgan




msg:3139040
 7:19 pm on Oct 29, 2006 (gmt 0)

Man, that looks like the hard way!

To 404 anything but these filetypes (and including appended query strings), just rewrite the requests to a non-existent file -- as long as this non-existent file is one of the 'allowed' types (avoids a rewrite loop):

# Allowed file extensions; If not in this group, respond with 404-Not Found
RewriteRule !(^$¦\.(php5?¦html?¦txt¦css¦pdf¦zip¦gz¦jpe?g¦gif¦png¦bmp¦ico)$) /this_filepath_does_not_exist.html [L]

I'm not sure why you want to 404 these requests, though, as the correct HTTP response would be a 403-Forbidden. If the file exists, I would not recommend responding with a 404 or 410-Gone, simply because these responses are inaccurate and 'false.' For best results with search engines, always use the response codes exactly as defined. I'd prefer:

# Allowed file extensions; If not in this group, respond with 403-Forbidden
RewriteRule !(^$¦\.(php5?¦html?¦txt¦css¦pdf¦zip¦gz¦jpe?g¦gif¦png¦bmp¦ico)$) - [F]

As always, replace all broken pipe "¦" characters above with solid pipe characters before use; Posting on this forum changes the pipe characters.

[Added] The main problem is that [R=404] is not among the documented flag values for RewriteRule, and it won't work. [/added]

Jim

[edited by: jdMorgan at 7:22 pm (utc) on Oct. 29, 2006]
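
For readers who, like the original poster, would rather keep separate RewriteCond lines than a single long alternation, the whitelist above might be laid out along the following lines. This is only a sketch under stated assumptions: it is meant for a per-directory .htaccess with mod_rewrite available, it uses solid pipe characters, it tests %{REQUEST_URI} (which the current mod_rewrite documentation describes as the path without the query string), and it treats any request ending in a slash as a directory-index request that should be let through. The grouping of extensions into two conditions is purely illustrative.

```apache
RewriteEngine On

# Assumption: requests ending in "/" (e.g. "downloads/") are directory-index
# requests and should not be blocked
RewriteCond %{REQUEST_URI} !/$
# Allowed script, document and archive types
RewriteCond %{REQUEST_URI} !\.(php5?|html?|txt|css|pdf|zip|gz)$ [NC]
# Allowed image types
RewriteCond %{REQUEST_URI} !\.(jpe?g|gif|png|bmp|ico)$ [NC]
# Consecutive RewriteCond lines are ANDed, so only requests that match
# none of the allowed patterns reach this rule and receive 403-Forbidden
RewriteRule ^ - [F]
```

Because no [OR] flags are used, a request is refused only when it fails every allowed pattern. If the conditions see only the path, a request such as blah.vbs?value=.css is refused just like blah.vbs. The Apache 2.4 flag documentation also describes the R flag as accepting status codes outside the 3xx range, so on such a release the last line could instead read `RewriteRule ^ - [R=404]`; check the flag documentation for the version you run before relying on that.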

Birdman




msg:3139054
 7:34 pm on Oct 29, 2006 (gmt 0)

How about a different approach? This is much cleaner, IMO. Simply enter the filetypes that you don't want accessible.

<Files ~ "\.(sh¦cgi¦pl¦py)$">
order allow,deny
deny from all
</files>
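
The Order and Deny directives used here belong to the older access-control syntax. As a hedged sketch only, assuming Apache 2.4 with mod_authz_core and an AllowOverride setting that permits authorization directives in .htaccess, the same blacklist could be written with Require (solid pipe characters shown):

```apache
# Refuse any file whose name ends in one of the listed extensions
<FilesMatch "\.(sh|cgi|pl|py)$">
    Require all denied
</FilesMatch>
```

As with the version above, this lists what to deny rather than what to allow, so any extension missing from the list remains reachable.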

jdMorgan




msg:3139100
 8:49 pm on Oct 29, 2006 (gmt 0)

That's certainly shorter, but the question was asked:

> I wish to prohibit all BUT certain file types from the browsers using htaccess.

Jim

Birdman




msg:3139118
 9:26 pm on Oct 29, 2006 (gmt 0)

Yeah, you're right jd. I hadn't even seen your post when I posted. I hope you didn't think I meant it was cleaner than your solution :)

jdMorgan




msg:3139165
 10:37 pm on Oct 29, 2006 (gmt 0)

Well, it may indeed be shorter and cleaner, if we ignore the differences between mod_access and mod_rewrite, but I just like to keep the answers focused on the 'original requirements.'

My post was simply to point out to those who read later that there's a difference between "choosing what to allow" versus "choosing what to deny." This difference causes a lot of confusion when discussing, for example, user-agent blacklisting versus user-agent whitelisting.

The two factors to consider when choosing are:

  • Which method is easier to maintain over the long term?
  • Which has the lowest impact if there is an error or omission in the 'list'?

Only the Webmaster can decide which approach is better for his/her site after examining those criteria.

The most common mistake in the design of software that must be secured against malicious or erroneous user input (e-mail, comment, forum, and blog scripts, for example) is to use the blacklist approach; If the author forgets to harden the code against just one exploit, then that's the one that the bad guys will find and use. Therefore, a whitelist approach --deciding what to allow-- *is* often a better idea.

Jim
