FilesMatch

   
6:21 am on Dec 6, 2012 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



Can't think why I never thought of asking about this before...

Most of my RewriteRules are extension-specific. Pages, or images, et cetera.

What would happen if I took each of those sets of rules and shoved them inside FilesMatch envelopes? If it doesn't end in .xtn, the server skips the whole package. Doesn't even look at the rule, let alone the conditions. Would it be faster, slower, fatal error, no difference?

Corollary question: Does FilesMatch mean FilesMatch or, uhm, URLMatch? In particular, would a directory match / or .html or both? In my current setup, mod_dir seems to execute before mod_rewrite. But I wouldn't want to rely on that.
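
For concreteness, here is a minimal sketch of the kind of envelope being asked about. The referer host and the rule itself are made up for illustration and are not from anyone's real htaccess. RewriteEngine On is repeated inside the envelope, which a later post in this thread reports is necessary. Note also that the mod_rewrite documentation describes rewrite rules inside <Files> sections as syntactically permitted but unsupported, so treat this as an experiment rather than a recommendation.

```apache
# Sketch only: the referer host and the rule are hypothetical.
RewriteEngine On

<FilesMatch "\.(jpe?g|gif|png)$">
    # Repeated inside the envelope (reported as necessary later in the thread)
    RewriteEngine On

    # Refuse image requests whose referer is a hypothetical unwanted site
    RewriteCond %{HTTP_REFERER} ^https?://(www\.)?unwanted\.example/ [NC]
    RewriteRule ^ - [F]
</FilesMatch>
```

The pattern is a bare ^ with a - substitution, so the sketch does not depend on what string the RewriteRule pattern is tested against inside the envelope.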
11:09 pm on Dec 6, 2012 (gmt 0)

phranque (WebmasterWorld Administrator)



FilesMatch refers to the filesystem, not webspace.

Configuration Sections - Apache HTTP Server:
http://httpd.apache.org/docs/2.2/sections.html#file-and-web
1:27 am on Dec 7, 2012 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



Ah. Kinda thought so. Makes it simpler. Or possibly not, since some rules are concerned with requests, independent of whether there's a real file at the far end.

What do you bet the primary question was answered by g1smd 12 hours ago, promptly eaten by the server, and now he hasn't got the energy to answer it all over again? :)
1:21 pm on Dec 7, 2012 (gmt 0)

coopster (WebmasterWorld Administrator)



To answer your original question: Faster in FilesMatch containers. Think of all the requests that will not have to run through the expression engine again for analysis!
6:33 am on Feb 7, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



:: bump ::

I did a little more experimenting on this. MAMP and test site, not real life.

The plus side to working on "real" files is that you don't have to go through all those (/|\.html)$ contortions that you go through when a request might be for either a directory or a secondary page.

But then the rule will only kick in after mod_dir has done its stuff, instead of intercepting requests on their first pass through your htaccess. And you can't slam the door on people asking for nonexistent files; all you can give them is your basic 404. This is not as satisfying. (Also potentially riskier if they take the extra step of picking up the 404 page, which would probably have more information than the 403 page.)

LocationMatch would be nicer, since it deals with URLs rather than physical files, but you can't use it in htaccess.
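
For anyone who does have access to the server config, a LocationMatch section might look roughly like this. It is only a sketch: the Header line is a hypothetical stand-in for whatever you actually want applied, and it assumes mod_headers is loaded.

```apache
# Server config or <VirtualHost> only; not valid in htaccess.
# The pattern is matched against the URL path, so no file needs to exist.
<LocationMatch "\.(jpe?g|gif|png)$">
    # Hypothetical directive: cache header for anything requested under an image URL
    Header set Cache-Control "max-age=86400"
</LocationMatch>
```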

A further quirk is that RewriteEngine On isn't inherited from the body of an htaccess into a <FilesMatch> envelope within the same htaccess. You have to say it all over again. If the Apache docs mention this, it's in very small print; I only worked it out by trial and error. You would presumably have to repeat any [L] statements as well ("If the request is for anything in the /boilerplate/ directory, stop right here" -- that kind of thing).
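
A sketch of what that repetition might look like, assuming the htaccess sits in the document root. The \.html$ pattern and the placeholder comment are illustrative; only the /boilerplate/ directory comes from the example above.

```apache
RewriteEngine On

# Outside the envelope: leave /boilerplate/ requests alone
RewriteRule ^boilerplate/ - [L]

<FilesMatch "\.html$">
    # Not inherited from the lines above, per the trial and error described here
    RewriteEngine On

    # Repeat the exemption; REQUEST_URI is tested so the sketch does not
    # depend on what the RewriteRule pattern is matched against in here
    RewriteCond %{REQUEST_URI} ^/boilerplate/
    RewriteRule ^ - [L]

    # ...the page-specific rules would follow...
</FilesMatch>
```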

I've got a lurking suspicion that it's only cost-effective if you can pack all your RewriteRules into one envelope or another.
7:58 am on Feb 19, 2013 (gmt 0)

lucy24 (WebmasterWorld Senior Member)



:: further bump ::

Well, I took the plunge and added a <FilesMatch> envelope for images only:

<FilesMatch "\.(jpe?g|gif|png)$">


The contents of the envelope turned out to be evenly split between [F] and [L]. No redirects. I put the envelope in front of the general RewriteRules, though I'm pretty sure it doesn't matter, since it is read separately.

And now I have two head-scratchers.

#1 I was under the impression that <Files> applied only to real files that physically exist. -f kind of thing. And also that any rules inside an envelope are in addition to, not instead of, any rules outside the envelope. So I left a final trio of rewrites on the outside:

RewriteRule smallgifs/dot\d+\.png /pictures/smallgifs/onedot.png [L]
RewriteRule smallgifs/dot[\w-]+\.gif /pictures/smallgifs/onedot.gif [L]
RewriteRule smallgifs/dot7$ /pictures/smallgifs/onedot.gif [L]

This did not work. The rewrites didn't take place, and requests returned a 404 instead. To make the rewrites happen, I had to put them inside the FilesMatch envelope. Where they match against... uh... nothing, but nevertheless get rewritten as intended.

On the other hand, a group of image redirects that I left out of the envelope-- again because the requested files don't exist-- take place perfectly well. It's only the rewrite that plays dead.
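
Reconstructed as a sketch, the arrangement that ended up working looks something like this. The placeholder comment stands for rules not shown in this post, and the extensionless dot7 rule is left out because its requested name would not match the FilesMatch pattern; the post doesn't say where that one ended up.

```apache
<FilesMatch "\.(jpe?g|gif|png)$">
    RewriteEngine On

    # ...the [F] and [L] image rules mentioned above go here...

    # Rewrites that only worked once moved inside the envelope
    RewriteRule smallgifs/dot\d+\.png /pictures/smallgifs/onedot.png [L]
    RewriteRule smallgifs/dot[\w-]+\.gif /pictures/smallgifs/onedot.gif [L]
</FilesMatch>
```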

#2 Continuing from #1: If I ask for

example.com/pictures/smallgifs/dot[\w-]+\.gif


using the "wrong" form of the domain name, the request doesn't get picked up by the canonicalization redirect. (Obviously this will only happen when I'm typing a URL straight into the address bar, but the RewriteRule doesn't know that.) Putting a duplicate of the canonicalization redirect into the middle of the envelope is not the solution; this results in a request for

www.example.com//home/lucy24/example.com/pictures/smallgifs/dot_j1.gif


-- the form you'd get if your rewrites and redirects were happening in the wrong order. (Note the two slashes. The first one is from the text of the rule; the second is from the capture.) Except that the rewrite obviously hasn't happened yet, or I'd be seeing "onedot.gif" at the end. In fact the rewrite aspect is a red herring, because the same thing happens if I request any image file with the wrong domain name. But everything comes through perfectly well if the request has the right name.
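
One thing that might be worth trying, offered as an untested sketch: build the redirect target from %{REQUEST_URI} instead of from a capture, so that whatever path the pattern is being matched against inside the envelope never ends up in the URL. The http scheme and the exact host condition are assumptions; www.example.com is taken from the request shown above.

```apache
<FilesMatch "\.(jpe?g|gif|png)$">
    RewriteEngine On

    # Canonicalization: build the target from REQUEST_URI rather than
    # from a capture, so a filesystem path cannot leak into the URL
    RewriteCond %{HTTP_HOST} !^(www\.example\.com)?$ [NC]
    RewriteRule ^ http://www.example.com%{REQUEST_URI} [R=301,L]

    # ...remaining image rules...
</FilesMatch>
```

The condition also lets requests with an empty Host header through untouched, which avoids redirecting old HTTP/1.0 clients in a loop.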

I am puzzled.


:: wandering off to deal with unrelated but equally worrying question of how g### has contrived to crawl a page which, to the best of my knowledge, has no links to it anywhere in the universe, and will not be crawl-ready for another week at least ::
 
