Forum Moderators: phranque

htaccess rewrite for file versioning

         

mihomes

12:12 am on May 6, 2015 (gmt 0)

10+ Year Member



I have this working, but would like some input from the regex gurus before putting anything into real use :) I want to implement versioning on my js and css files to ensure that users always retrieve the latest file (even with my cache settings). I have read up on this, and using a query string (?v=558353489) could cause issues with proxies. For that reason, I am placing the version in the actual file name, using a delimiter (testver) followed by an md5 hash of the file.

I want to apply this to all .css and .js files which reside anywhere within the test.com/custom/ folder.

http://www.test.com/custom/*anything*.testver*hash*(.min?).css or .js

The goal is to remove .testver*hash* as this would be the real filename.

examples of what I am talking about :


http://www.test.com/custom/styles.testver34frt564.css <--> http://www.test.com/custom/styles.css
http://www.test.com/custom/styles.testver34frt564.min.css <--> http://www.test.com/custom/styles.min.css
http://www.test.com/custom/styles.1.4.5.testver34frt564.css <--> http://www.test.com/custom/styles.1.4.5.css
http://www.test.com/custom/folder/another/styles.1.4.5.testver34frt564.css <--> http://www.test.com/custom/folder/another/styles.1.4.5.css


What I came up with is :

#BEGIN versioning for js and css files (only the custom folder)
RewriteRule ^(custom)/(.*)\.(testver.*)\.((min\.)?js|css)$ $1/$2.$4 [L]


I'm almost positive this can be rewritten in a cleaner and more efficient way, but I have always been horrible with these.

lucy24

3:56 am on May 6, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



^(custom)/(.*)\.(testver.*)\.

Ahem, cough-cough, hmph.

Surely you've got some idea what will be occupying each of those .* slots? \w+ or [a-z]+ or [^.]+ or \d+\.\d+\.\d+ or, well, something. Is there a potential for nested subdirectories? If so, it's some permutation of ([^/]+/). Let's figure out what each of those two .* stands for, and write the rule accordingly.

Is the element .min really intended to only be an option with scripts? So it's .js OR .min.js OR .css but never .min.css? That's how the pattern is written, but it's not what your examples seem to imply. At the moment, however, there's a bigger problem. If "min." is present, it will already have been taken as part of the .* that comes after "testver"; it will never be taken as part of $4. This is bad news if it's supposed to be retained, while the previous bit is discarded.
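To see the greedy capture in action, here is a minimal Python sketch. It assumes Python's re module handles this particular pattern the same way as the regex engine behind mod_rewrite, and the helper name rewrite_original is made up for illustration.

```python
import re

# The rule as first posted. In .htaccess the leading slash is already
# stripped from the path before the pattern is applied.
ORIGINAL = re.compile(r'^(custom)/(.*)\.(testver.*)\.((min\.)?js|css)$')

def rewrite_original(path):
    """Return the rewritten path, or None when the pattern does not match."""
    m = ORIGINAL.match(path)
    if m is None:
        return None
    # Equivalent of the target "$1/$2.$4"
    return '{}/{}.{}'.format(m.group(1), m.group(2), m.group(4))

# "min." is swallowed by the greedy .* after "testver", so it is lost.
print(rewrite_original('custom/styles.testver34frt564.min.css'))  # custom/styles.css
print(rewrite_original('custom/app.testver34frt564.min.js'))      # custom/app.js
```

Both requests match, but the rewritten path points at the non-minified name, which is the problem described above.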

Incidentally, there's no point in capturing the literal text "custom". Just repeat it on both sides, coming and going. Oh, and make sure the target starts in a / slash.

Similarly there's no point in capturing $3 (testver blahblah) since the whole point of the rule is to discard this part. No point in capturing something that's not going to be reused.

One possibility-- there are lots of others-- is:
RewriteRule ^custom/([^.]+\.)testver[^.]*\.(min\.)?(js|css)$ /custom/$1$2$3 [L]

The first two captured bits each end in a . for tidiness; in "min." of course it's essential.

mihomes

4:05 am on May 6, 2015 (gmt 0)

10+ Year Member



Actually, after some more testing, this does not work... it is not correctly handling the optional .min before .js|css

Edit : That was in regards to my own post. I did not see yours Lucy and will look now.

mihomes

4:15 am on May 6, 2015 (gmt 0)

10+ Year Member



Yes, they could be nested in folders within /custom/. They can end in either .css, .js, .min.css, or .min.js. Immediately preceding this ending would be my 'version' of 'testver + an md5 hash of the file'.

The first .* slot in the code I posted was meant to take care of any nested folders within /custom/ as well as the first part of the filename. Some of these are going to be vendor plugins and what not so a filename could be something like 'plugin.1.8.1.js' which I would then change to something like 'plugin.1.8.1.testver543fgte6.js'. Hopefully that makes sense. I would like to keep that versioning of course so I know what version of a file is being used. My own versioning is simply to ensure the latest file is always received by the user regardless of caching.

mihomes

5:12 am on May 6, 2015 (gmt 0)

10+ Year Member



Lucy, that did not work for me, but it pointed me in the right direction. I found a site to test rewrite rules that shows the output which helped tremendously.

I came up with :

RewriteRule ^custom/(.*\.)testver[a-z0-9]*\.(min\.)?(js|css)$ /custom/$1$2$3 [L]


which handles all nested folders within /custom/ as well as filename versioning such as /custom/plugin.1.8.3.testver345fr5454.js. The hash is always going to be alphanumeric as well.

(.*\.) is meant to be 'anything (including periods) that ends in a period before testver'.
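For anyone who wants to check this rule without a test site, the sketch below runs the same pattern over the example URLs from the first post. It is only an approximation: it assumes Python's re module agrees with mod_rewrite's regex engine for this pattern, and the function name rewrite is hypothetical.

```python
import re

# Pattern from the rule above; the path has no leading slash in .htaccess.
RULE = re.compile(r'^custom/(.*\.)testver[a-z0-9]*\.(min\.)?(js|css)$')

def rewrite(path):
    """Return the rewritten path, or None when the rule does not apply."""
    m = RULE.match(path)
    if m is None:
        return None
    # Equivalent of "/custom/$1$2$3"; an unmatched optional group is empty.
    return '/custom/' + m.group(1) + (m.group(2) or '') + m.group(3)

for requested in [
    'custom/styles.testver34frt564.css',
    'custom/styles.testver34frt564.min.css',
    'custom/styles.1.4.5.testver34frt564.css',
    'custom/folder/another/styles.1.4.5.testver34frt564.css',
    'custom/plugin.1.8.3.testver345fr5454.js',
]:
    print(requested, '->', rewrite(requested))
```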

lucy24

5:48 am on May 6, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



OK, good, the [a-z0-9] version will prevent unwanted capture of "min.". Personally I'd save a few keystrokes by saying \w instead, but that's a matter of individual style. So now we're down to one .* only. If it were nothing but directories, you could express it as
((?:[^/]+/)*)
with non-capture markup ?: on the inner group so you don't have to keep count of which numbers to skip.
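A quick way to convince yourself about the group numbering is the Python sketch below. The trailing ([^/]+)$ for the file name is an assumption added only so there is something to match after the directories; it is not part of the suggestion above.

```python
import re

# The outer group captures the whole directory prefix. The inner (?:...)
# group repeats one "name/" segment without taking a capture number.
DIRS = re.compile(r'^custom/((?:[^/]+/)*)([^/]+)$')

m = DIRS.match('custom/folder/another/styles.css')
print(m.group(1))   # folder/another/
print(m.group(2))   # styles.css
print(DIRS.groups)  # 2 capturing groups in total
```

With a plain ( ) on the inner group the file name would have become the third capture instead of the second.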

But if there really could be absolutely anything before "testver" in the filename, then this is the rare case where you have no alternative. It means that the server will start by capturing the whole request all the way to the end, and then back off and say "oh, oops, I was supposed to leave room for 'testver'". But at least this backing-off now only happens once.

In a different context, I'd say .*? for .* but here I'm not sure it would make any difference to performance. Formally .*? means "as few as possible". In practice I think it means "If there turn out to be multiple occurrences of 'testver' then stop before the first one" and that's not a factor here.
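The difference (or lack of one) can be checked with a small Python sketch. It assumes Python's re module treats greedy and lazy quantifiers the same way as the server's regex engine, and it says nothing about speed, only about what gets captured.

```python
import re

GREEDY = re.compile(r'^custom/(.*\.)testver[a-z0-9]*\.(min\.)?(js|css)$')
LAZY = re.compile(r'^custom/(.*?\.)testver[a-z0-9]*\.(min\.)?(js|css)$')

# In the full rule both forms capture exactly the same pieces.
path = 'custom/plugin.1.8.3.testver345fr5454.min.js'
print(GREEDY.match(path).groups() == LAZY.match(path).groups())  # True

# In a bare pattern the two do diverge once "testver" appears twice.
twice = 'a.testver1.testver2'
print(re.match(r'(.*)testver', twice).group(1))   # a.testver1.
print(re.match(r'(.*?)testver', twice).group(1))  # a.
```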

Eeuw. I think I've set a record for number of "but"s in a single post. But (haha) it was worth it, because the browser had temporarily eaten my insertion point (very disorienting!) and now all this extra typing has caused it to reappear at last. Whew.

mihomes

7:08 am on May 6, 2015 (gmt 0)

10+ Year Member



Played around with this more... I probably should have thought of how I was going to implement this in my pages first haha. I quickly realized the whole testverhash 'inside' the filename would cause a problem as I want this to be automated.

So... in a nutshell here is what I am thinking now :

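// Echo the URL with the file's last-modified time appended as a version suffix.
function autoVer($url){
    // $url is root-relative, e.g. /custom/css/login.min.css
    $ver = '.'.filemtime($_SERVER['DOCUMENT_ROOT'].$url);
    echo $url.$ver;
}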

<link href="<?php autoVer('/custom/css/login.min.css'); ?>" rel="stylesheet" type="text/css"/>


Which in the source would be displayed as :

<link type="text/css" rel="stylesheet" href="/custom/css/login.min.css.1430894107">
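The same idea can be sketched outside PHP. The Python version below is only an illustration of the approach, not a replacement for the function above: the name auto_ver and the document_root parameter are hypothetical, and the modification time is set by hand so the result is predictable.

```python
import os
import tempfile

def auto_ver(url, document_root):
    """Return `url` with '.<mtime>' of the underlying file appended."""
    file_path = os.path.join(document_root, url.lstrip('/'))
    return '{}.{}'.format(url, int(os.path.getmtime(file_path)))

# Demo with a throwaway document root and a fixed modification time.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'custom', 'css'))
    css = os.path.join(root, 'custom', 'css', 'login.min.css')
    with open(css, 'w') as handle:
        handle.write('body{}')
    os.utime(css, (1430894107, 1430894107))
    print(auto_ver('/custom/css/login.min.css', root))
    # /custom/css/login.min.css.1430894107
```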


So basically I am appending a timestamp of the last modification time (upload time in my case). In turn I am using the following for the rewrite :

RewriteRule ^custom/(.*\.(js|css))\.[0-9]+$ /custom/$1 [L]


With these changes the only 'requirements' are that the request is for a file in /custom/ (including any subfolders) whose name ends in .js or .css followed by a .timestamp. This cleans things up a bit and is much easier. I am still using (.*) here, but since it has to cover any additional folders plus the actual filename of the css/js, I don't know if there is anything else I could use?
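Here is a small Python sketch of what that rule does to a few requests. It assumes Python's re module matches this pattern the same way mod_rewrite would, and strip_timestamp is a made-up helper name.

```python
import re

# Pattern from the timestamp rule; no leading slash in .htaccess context.
RULE = re.compile(r'^custom/(.*\.(js|css))\.[0-9]+$')

def strip_timestamp(path):
    """Return the real file path, or None when the rule does not apply."""
    m = RULE.match(path)
    if m is None:
        return None
    # Equivalent of the target "/custom/$1"
    return '/custom/' + m.group(1)

print(strip_timestamp('custom/css/login.min.css.1430894107'))  # /custom/css/login.min.css
print(strip_timestamp('custom/plugin.1.8.1.js.1430894107'))    # /custom/plugin.1.8.1.js
print(strip_timestamp('custom/css/login.min.css'))             # None
```

A request without the timestamp does not match, so it is served as the real file in the usual way.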

lucy24

7:18 pm on May 6, 2015 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Can the filename itself potentially include strings like "blahblah.1.2" as suggested earlier? If yes, then you really are stuck. If you could be certain that there will never be any earlier . (literal dots) in the filename, then you could replace .* with [^.]+ to eliminate backtracking.
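To make the trade-off concrete, this Python sketch swaps .* for [^.]+ in the timestamp rule. It assumes Python's re module behaves like the server's engine here; the name NO_DOTS is just a label for the example.

```python
import re

# Same rule as before, but the file path may not contain any earlier dots.
NO_DOTS = re.compile(r'^custom/([^.]+\.(js|css))\.[0-9]+$')

# A dot-free name (folders are fine, since / is not a dot) still matches.
print(bool(NO_DOTS.match('custom/css/login.css.1430894107')))    # True

# A vendor-style name with its own version dots no longer matches.
print(bool(NO_DOTS.match('custom/plugin.1.8.1.js.1430894107')))  # False
```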

In actual performance there's no difference between * and +. But I prefer to use + when the one thing you can be sure of is that there will be something here. That is, you'll never have
/custom/.css
and-that's-all.

Is there any possibility whatsoever that the sequence ".js" or ".css" (including the dot) could occur anywhere else in the filename? If no, you don't need the [0-9]+$ part at all. Just end the pattern with
^custom/(.*\.(js|css))\.
What's after the dot doesn't matter so the server doesn't even need to consider it.
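A rough Python check of the shortened pattern is below, again assuming Python's re module is a fair stand-in for the server's regex engine. The .js.map request is a hypothetical example, included because the shorter pattern matches it as well, which may or may not be what you want.

```python
import re

# No [0-9]+$ at the end: anything may follow the final dot.
SHORT = re.compile(r'^custom/(.*\.(js|css))\.')

m = SHORT.match('custom/css/login.min.css.1430894107')
print(m.group(1))  # css/login.min.css

# Side effect: other suffixes after .js or .css are accepted too.
m = SHORT.match('custom/app.min.js.map')
print(m.group(1))  # app.min.js

# Still no match when nothing follows the extension.
print(SHORT.match('custom/css/login.min.css'))  # None
```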

Have you simply ditched the "min." part? Within the capture, that is: if it were just a choice between login.blahblah and login.min.blahblah it might still be
^custom/([^.]+\.(min\.)?(js|css))\.
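As a sanity check, this pattern can be tried in Python, assuming its re module agrees with the server's engine for this case. WITH_MIN is only a label for the example.

```python
import re

# Dot-free base name, optional "min.", then the extension and a dot.
WITH_MIN = re.compile(r'^custom/([^.]+\.(min\.)?(js|css))\.')

# Both the plain and the minified name keep their full file name in $1.
print(WITH_MIN.match('custom/css/login.css.1430894107').group(1))      # css/login.css
print(WITH_MIN.match('custom/css/login.min.css.1430894107').group(1))  # css/login.min.css
```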

I guess technically "(js|css)" is "(j|cs)s" but let's not go overboard ;)