mod_rewrite question

Return 404 for files requested with an extension

         

dizzynutter

5:52 pm on Apr 11, 2005 (gmt 0)




Hi,
I'm new to this mod_rewrite so please excuse any messy scripting.

I have got mod_rewrite working to create clean URLs, but I would also like to prevent visitors from directly requesting files with extensions, and have those requests answered with, say, a 404 error.

Server Setup: ISP APACHE SERVER 1.3.29

## Sample of my .htaccess file ##
Options +FollowSymLinks
RewriteEngine on
RewriteBase /

# Deny Access to .htaccess file
RewriteRule ^.htaccess*$ - [F]

# Parse PHP files as html or htm
AddType x-mapp-php4 .html .htm

# Create shortcuts for following urls
RewriteRule^services/?$ /user/services.html
RewriteRule^contact/?$ /user/contact.html

## START: NOT ABLE TO GET THIS PART TO WORK
## If visitor requests * any file with an
## extension, redirect them to a 404

#RewriteCond %{REQUEST_FILENAME}.php -f
RewriteCond %{REQUEST_FILENAME} -f
#RewriteRule [L,R=301]
RewriteRule ^$1\.html [L,R=301]
## END: NOT ABLE TO GET THIS PART TO WORK

# Direct Page Errors to custom pages
Redirect 403 errordocs/403error.html

Any help would be appreciated,
Thanks,
Dizzy

jdMorgan

11:20 pm on Apr 11, 2005 (gmt 0)




Dizzy,

Welcome to WebmasterWorld!

The problem here is that you're not necessarily looking at the user-requested URL when you test a URL-path in RewriteRule. That URL path may have been changed by a preceding rewrite, either in this .htaccess file, or in httpd.conf.

When any URL is rewritten in .htaccess, httpd.conf and any .htaccess files in the new path are re-invoked, in order to test for further rewrites or access restrictions. When that happens, the URL 'seen' by RewriteRule is updated to reflect the new URL.

By using %{THE_REQUEST}, you can examine the URL that was originally requested by the visitor.
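
To make the difference concrete, here is a rough Python sketch of the idea. It is not how Apache works internally; `SHORTCUTS`, `rewrite_once` and `has_extension` are hypothetical helpers that only mimic the two shortcut rules from the sample, to show that the rewritten path gains an extension while the originally requested path does not.

```python
import re

# Hypothetical stand-ins for the two shortcut rules in the .htaccess sample.
SHORTCUTS = [
    (re.compile(r"^services/?$"), "/user/services.html"),
    (re.compile(r"^contact/?$"), "/user/contact.html"),
]


def rewrite_once(url_path):
    """Apply the first matching shortcut; return the path unchanged if none match."""
    # Per-directory rules are tested against the path without its leading slash.
    local = url_path.lstrip("/")
    for pattern, target in SHORTCUTS:
        if pattern.search(local):
            return target
    return url_path


def has_extension(url_path):
    """True if the last path segment contains a dot."""
    return "." in url_path.rsplit("/", 1)[-1]


requested = "/services"            # what the visitor asked for
current = rewrite_once(requested)  # what a later RewriteRule pass would see

print(has_extension(current))    # the rewritten path ends in .html
print(has_extension(requested))  # the original request has no extension
```

A rule that tests the current path would therefore also block the clean URL after it has been rewritten, which is why the check needs to look at the original request instead.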


# Sample of my .htaccess file ##
Options +FollowSymLinks
RewriteEngine on
RewriteBase /
#
# Deny Access to .htaccess files in all directories and subdirectories
RewriteRule \.htaccess$ - [F]
#
# Parse PHP files as html or htm
AddType x-mapp-php4 .html .htm
#
# Create shortcuts for following urls
RewriteRule ^services/?$ /user/services.html [L]
RewriteRule ^contact/?$ /user/contact.html [L]
#
# If visitor directly requests any file with an extension, rewrite to a nonexistent file to create a 404
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /[^.]+\.[^\ ]+\ HTTP/
RewriteRule . /file_that_does_not_exist.html [L]
#
# Direct Page Errors to custom pages (ErrorDocument, not Redirect, maps a status code to a page)
ErrorDocument 403 /errordocs/403error.html

The format of %{THE_REQUEST} is pretty much what you see in your raw access log file:

GET /file_with.extension HTTP/1.1

The [A-Z]{3,9} pattern will catch GET, PUT, POST, PROPFIND, etc. Next, we match a literal space and the leading slash, then anything up to the first 'dot', then the 'dot' itself, then the extension up to the next space, then that space, and finally "HTTP/", which is followed by the protocol version (usually "1.1" or "1.0").
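
If you want to try the pattern out before putting it on a live server, a quick Python check along these lines may help. Python's `re` module is not the regex engine Apache uses, so treat it only as an approximation for this simple pattern; the sample request lines and the `requests_extension` helper are made up for illustration.

```python
import re

# Pattern copied from the RewriteCond; "\ " is an escaped literal space.
THE_REQUEST_WITH_EXTENSION = re.compile(r"^[A-Z]{3,9}\ /[^.]+\.[^\ ]+\ HTTP/")


def requests_extension(request_line):
    """True if the request line would satisfy the RewriteCond pattern."""
    return THE_REQUEST_WITH_EXTENSION.search(request_line) is not None


samples = [
    "GET /user/services.html HTTP/1.1",  # direct request for a file with an extension
    "POST /contact.php HTTP/1.0",        # another method, another extension
    "GET /services HTTP/1.1",            # clean URL
    "GET /services/ HTTP/1.1",           # clean URL with trailing slash
    "GET /search?q=1.5 HTTP/1.1",        # no file extension, but a dot in the query string
]

for line in samples:
    print(requests_extension(line), line)
```

The first two lines and the last one match, while the two clean URLs do not. The last sample is worth noting: because the pattern looks at the whole request line, a dot inside a query string is treated the same as a file extension, so test against the kinds of URLs your site really receives.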

Jim

sitz

2:56 am on Apr 12, 2005 (gmt 0)




Note also that this:
# Deny Access to .htaccess file
RewriteRule ^.htaccess*$ - [F]

...*probably* isn't necessary (but test it!), since Apache's default config file ships with this:


<Files ~ "^\.ht">
Order allow,deny
Deny from all
</Files>

...which will return a 403 for any request for files starting with ".ht", such as ".htaccess" and ".htpasswd".
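
As a rough illustration of what that `^\.ht` expression covers, here is a small Python sketch. It only approximates the match (Apache applies the `<Files>` pattern to the file name, not the full path), and the list of file names is made up.

```python
import re

# Same expression as in the <Files ~ "^\.ht"> container.
HT_FILES = re.compile(r"^\.ht")

# Hypothetical file names to test against the expression.
names = [".htaccess", ".htpasswd", "index.html", "htaccess.txt"]

# Names that start with ".ht" would be covered by the deny block.
blocked = [name for name in names if HT_FILES.search(name)]
print(blocked)
```

Only the names that begin with ".ht" are caught, so ordinary files such as "index.html" are unaffected by that block.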