RewriteCond to ignore specific file types

         

Elijha

10:22 am on Dec 8, 2009 (gmt 0)

10+ Year Member



Ok, I admit it: I really can't get my head around regular expressions, and it's really affecting my ability to learn and understand Apache .htaccess.

Basically, I was using a .htaccess 404 to redirect domain/url/everything to a PHP script. I knew this was bad, as everything was 404'ing in the logs and search engines would penalize me for it in the long run. I knew there were other ways, but everything was working and I kept putting it off. Then, searching the net, I found it amazingly hard to find a simple explanation or tutorial that would help me.

Then I found this [webmasterworld.com...] forum and it seemed like the perfect solution. It's seriously hard to find anyone commenting on how to redirect URL *directory* requests to a script and file requests to a file, and to 404 if there is *no file*. That's what I thought the code did, but it doesn't for me.

If I use


Options +FollowSymLinks
# Handle real errors with index.php
ErrorDocument 403 /index.php
ErrorDocument 404 /index.php
#
Order Deny,Allow
<FilesMatch "\.ht(access¦passwd)$">
Deny from all
</FilesMatch>
#
Options +FollowSymLinks
RewriteEngine on
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteCond %{REQUEST_URI} !^/robots.txt$
RewriteCond %{REQUEST_URI} !\.(gif¦tif?f¦bmp¦jpe?g¦png¦css¦js¦pdf¦doc¦swf¦xml¦zip¦mov¦avi¦doc)$ [NC]
RewriteRule (.*) /index.php [L]
ErrorDocument 500 "<h2>Site Error</h2>A Temporary error has occured"

The Apache logs show the URLs redirect fine, but files that exist get a 302 and are sent to the PHP script as well, e.g.:

"GET /Home HTTP/1.1" 200 6081
"GET /Scripts/supersleight.js HTTP/1.1" 302 2567
"GET /styles/master.css HTTP/1.1" 302 2567
"GET /Scripts/swfobject.js HTTP/1.1" 302 2567
"GET / HTTP/1.1" 200 6081
"GET / HTTP/1.1" 200 6081
"GET / HTTP/1.1" 200 6081
"GET /styles/master.css HTTP/1.1" 302 2567
"GET / HTTP/1.1" 200 6081

So something's not right. I thought the condition would match requests ending with .'ext' (where 'ext' is one of the file extensions I listed: bmp, jpg, etc.) and therefore -not- redirect them, letting the request go through naturally: 200 if the file exists and 404 if it doesn't. Instead, it was either not matching for some reason, or matching and still going to the PHP script (contrary to the forum link above, which I'm sure indicated it was meant to ignore matches).

So I inserted this specific rewrite rule above the PHP rule, made it the last rule if matching (L), and had it do nothing (-):


RewriteRule \.(gif¦jpe?g¦png¦ico)$ - [NC,L]

Not all the file types were listed in it, as it was just a test, but it fails: for example, /favicon.ico or /images/title.jpg still go to index.php.

For now I have added some ignore conditions to the PHP rewrite rule to get everything working, as most visual files are in these locations. However, I know there must be a better way, and I'm still flustered that things like favicon.ico, which sit in the root directory, won't show. Here's what I've added to ignore directories with static files in them:


RewriteCond %{REQUEST_URI} !^/images/
RewriteCond %{REQUEST_URI} !^/Scripts/

I'd be very grateful if anyone has any suggestions or an explanation as to where I'm going wrong (I'd definitely like to at least find that out).

E
PS sorry for being overly verbose.

jdMorgan

1:50 pm on Dec 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Well, your code does not redirect, it does an internal rewrite. Yes, I'm being picky about terminology here, because it's critical to not missing an important clue: You're seeing a 302-redirect when in fact your code doesn't invoke any redirects at all... It only rewrites incoming URL requests to a different filepath than those URL requests would otherwise resolve to.
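To illustrate the distinction, here is a minimal sketch for a per-directory .htaccess file. The paths `old-page`, `moved-page`, and `/new-page` are hypothetical and are not taken from the site in question.

```apache
RewriteEngine on

# Internal rewrite: the server quietly serves /index.php,
# the client never learns that the filepath changed.
RewriteRule ^old-page$ /index.php [L]

# External redirect: the [R] flag makes the server answer with a
# 302 status and a Location header, so the client sends a second request.
RewriteRule ^moved-page$ /new-page [R=302,L]
```

Only a rule carrying the [R] flag (or a full http:// substitution URL) should produce a 3xx status in the access log; the rules posted above have neither.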

And your "temporary-testing bypass rule" should certainly have worked for the filetypes it included, even if it wasn't comprehensive.

So, the solution to your mystery is that there's nothing wrong with your rule, and that the 302 redirect response is being invoked by some other module -- or perhaps by your script itself.

Note that the 302 target file is always 2567 bytes regardless of the requested URL -- What file is that size, and does knowing that file's name give you any clue?

First step: If you're not using MultiViews (content-negotiation) turn them off in your Options directive. (Also note that you only need one "Options" directive per .htaccess file, so get rid of that duplicate one. You might also consider sorting all of your directives into sections by module for readability/maintainability, since that's how they'll execute -- e.g. locate all of the ErrorDocument lines together.)

Second, if you're on Apache 2.x, turn off AcceptPathInfo unless you need it.

Both of these directives' modules can change the server's URL-to-filepath mapping, so if your site doesn't need them, get rid of them. The "Redirect" and "Alias" directives of mod_alias might also be involved if you're using them.
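As a rough sketch, the two settings could look like this in .htaccess, assuming the host permits overriding them (AllowOverride Options for the first line, AllowOverride FileInfo for the second):

```apache
# A single Options line: keep symlink following for mod_rewrite,
# switch off content negotiation.
Options +FollowSymLinks -MultiViews

# Apache 2.x only: reject trailing path info after a real filename.
AcceptPathInfo Off
```

If the host disallows either override, the server will typically answer every request with a 500 error, so test each line on its own.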

Also, a long shot: Posting on this forum modifies the pipe characters, changing solid pipes to broken pipe "¦" characters. If you copied any code from here, that could be contributory. Make sure all of the pipe characters in your regex patterns are solid; If not, re-type them from your keyboard.
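A short sketch of the difference, using a shortened, illustrative extension list:

```apache
# As displayed on the forum. The broken bar is an ordinary literal character
# in a regex, so the group is one long literal and never matches ".gif" or ".png":
#   RewriteCond %{REQUEST_URI} !\.(gif¦png)$ [NC]

# With solid pipes the group is an alternation: gif OR png.
RewriteCond %{REQUEST_URI} !\.(gif|png)$ [NC]
```

Because the condition is negated with "!", a pattern that never matches is always true, which would explain why every request, static files included, was passed on to the script.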

See if any of that helps, and please post back either way. There are some additional things to try, but I'd rather not over-complicate the thread with speculation.

Jim

Elijha

2:42 pm on Dec 8, 2009 (gmt 0)

10+ Year Member



Also, a long shot: Posting on this forum modifies the pipe characters, changing solid pipes to broken pipe "¦" characters. If you copied any code from here, that could be contributory. Make sure all of the pipe characters in your regex patterns are solid; If not, re-type them from your keyboard.

Not a long shot at all. I did copy the original code from this forum and, thinking the author knew some trick I didn't, deliberately used the broken pipe instead of the solid one, as that was what I had seen (I don't know, maybe it meant case-insensitive 'OR' ...or something :) ).

And your "temporary-testing bypass rule" should certainly have worked for the filetypes it included, even if it wasn't comprehensive.

Changing to solid pipes has this rule working absolutely fine, so I do now have a working solution. However, in search of a simpler rule file, if I comment out/remove it and test the original (with solid pipes), it has the same result as always... although the file size in the access.log is different now.

First step: If you're not using MultiViews (content-negotiation) turn them off in your Options directive. (Also note that you only need one "Options" directive per .htaccess file, so get rid of that duplicate one. You might also consider sorting all of your directives into sections by module for readability/maintainability, since that's how they'll execute -- e.g. locate all of the ErrorDocument lines together.)

Second, if you're on Apache 2.x, turn off AcceptPathInfo.


I'm running MAMP on an OS X box for development, and in the end it's going on a shared hosting provider (which I don't currently have access to), so I don't think I'm going to have much control over the Apache settings on it. Admittedly, apart from setting up some virtual hosting, I've never delved deeply into the Apache configuration. However, I tend to stay away from customising until I have a target to aim for (i.e. the hosting configuration), so for the moment I'd rather steer away from testing these options until I'm sure they are the problem.

Well, your code does not redirect, it does an internal rewrite. Yes, I'm being picky about terminology here, because it's critical to not missing an important clue: You're seeing a 302-redirect when in fact your code doesn't invoke any redirects at all... It only rewrites incoming URL requests to a different filepath than those URL requests would otherwise resolve to.

Yes, actually that had me a bit confused too, seeing second requests in the access.log when I suspected I should not, but I'm clearly a bit vague on all this. Generally I'm a quick learner, but it is now pushing into the early hours of the morning.

Thank you very much for your help. Even though it might have been very much a 'newb' mistake, I've learned a lot and have some further areas for investigation should I need them. However, since I do have a working .htaccess file with the 'temporary' rule, I'm very tempted to remove the offending line, leave in this new rule, put this problem at the bottom of my list, and continue fixing a glut of other issues. I hope that doesn't sound rash or insensitive; I'm sure we have all been there.

E

jdMorgan

4:06 pm on Dec 8, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



"Options -MultiViews" and "AcceptPathInfo off" are both configurable at the .htaccess level, and are recommended unless you know you need those features.

This wasn't a 'newb' mistake at all -- It was caused by the fact that we cannot allow "Unix piping" in the posts here. :)

Correct the pipes in your original rule, make sure that all rules end with at least an [L] flag, and you should not need the redundant "skip rule" at the top. It's partly a matter of 'style' but also, that's an extra rule which could have unwanted effects on later rules that you might wish to add at a later time. At the least, look into using the [S=nnn] flag instead of immediately ending mod_rewrite processing with [L] on that skip rule.
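A minimal sketch of the original rule with the pipes corrected. Three things here are assumptions rather than part of the posted code: "ico" is added so that /favicon.ico is served directly, the duplicate "doc" is dropped, and the dot in robots.txt is escaped.

```apache
RewriteEngine on
# Leave the front controller, robots.txt, and static files alone
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteCond %{REQUEST_URI} !\.(gif|tiff?|bmp|jpe?g|png|ico|css|js|pdf|doc|swf|xml|zip|mov|avi)$ [NC]
# Everything else is rewritten internally to the script
RewriteRule (.*) /index.php [L]
```

If a separate skip rule is kept anyway, a variant using [S=1] might look like the following. It skips only the next rule (together with its conditions) instead of ending mod_rewrite processing, so any rules added further down would still run for static files.

```apache
RewriteEngine on
# Static file requested: jump over the one rule that follows
RewriteRule \.(gif|jpe?g|png|ico)$ - [NC,S=1]
RewriteCond %{REQUEST_URI} !^/index\.php$
RewriteRule (.*) /index.php [L]
```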

Jim