
Apache Web Server Forum

    
need help with regex rewrite
EvertVd




msg:4416303
 4:15 pm on Feb 10, 2012 (gmt 0)

I am creating a workflow for myself which includes an automagical way of invalidating caches on my css and js files.

However, I stink at regular expressions.

The file being called has the format:

<filename>-rev<number>[.min].(js|css)

So it can be either a css or a js file and it can be minified or not.
I need a regex that rewrites this to:

<filename>[.min].(js|css)

But it should exclude things like jquery-1.7.min.js etc.

What I have so far is:
RewriteRule ^(.+)\.(rev\.+)\.(js|css)$ $1.$3 [L]
I get stuck on the optional .min
I should perhaps mention that <number> is a hexadecimal number

 

EvertVd




msg:4416310
 4:28 pm on Feb 10, 2012 (gmt 0)

After writing the above post I was wondering if this would be correct?:

RewriteRule ^(.+)-rev([a-fA-F0-9]+)\.(min\.)?(js|css)$ $1.$4 [L]
EvertVd




msg:4416431
 9:33 pm on Feb 10, 2012 (gmt 0)

I got a splitting headache, but I got this:

RewriteRule ^(.+)-rev([a-fA-F0-9]+)((\.min)?\.(js|css|png|jpg))$ $1$3 [L]

This seems to work, but I got it by trial and error, so if any expert can tell me whether this is safe to use for my purpose I would be ever so grateful.
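One way to check a rule like this without touching the server is to run the same pattern over a few sample names. The sketch below uses Python's re module as a stand-in for Apache's regex engine (an assumption; the two agree for this pattern), and the sample filenames and the rewrite helper are made up for illustration.

```python
import re

# The pattern from the rule above, transcribed for Python's re module.
PATTERN = re.compile(r"^(.+)-rev([a-fA-F0-9]+)((\.min)?\.(js|css|png|jpg))$")

def rewrite(path):
    """Return the rewritten path, or the path unchanged if the pattern does not match."""
    m = PATTERN.match(path)
    if m is None:
        return path
    # Mirrors the substitution $1$3: everything before "-rev" plus "[.min].ext".
    return m.group(1) + m.group(3)

# Hypothetical sample requests, not taken from a real site.
for sample in ["style-rev01F.css", "js/app-rev01F3A.min.js",
               "jquery-1.7.min.js", "photo-revdead.jpg"]:
    print(sample, "->", rewrite(sample))
```

The jquery name is left alone because it contains no "-rev". The last sample shows the one thing to watch for: a real filename that happens to end in "-rev" followed only by hex characters (here "dead") would be rewritten too.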

lucy24




msg:4416432
 9:36 pm on Feb 10, 2012 (gmt 0)

Overlapping your last post, so scroll back:

Oh, good, you got most of it sorted out. And you understand it a lot better than if someone had just written it for you. Don't you? ;)

The remaining problems are:

-- You start your pattern with .+ so the RegEx has to keep backtracking: "Oops! I was supposed to capture '-rev' back there."

-- You've got your full stops . correctly escaped but they are in the wrong places. The last part needs to be

(\.min)?\.(js|css)

If the rev can never be anything but a number, you don't need to include the a-f options. Just [0-9]+ or even \d+.

The trickiest part is the beginning. For this you will have to figure out all the possible forms a filename can have. Most importantly, how deeply nested can it be? That is, how many directories.
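The point about where the full stops and parentheses go can be seen by comparing the earlier attempt (substitution $1.$4) with a version grouped as (\.min)?\.(js|css). This is a rough sketch in Python's re module, assumed here to behave like Apache's engine for these patterns; the function names and the sample filename are invented for the example.

```python
import re

# Earlier attempt: the optional "min." sits in its own capture and is never reused.
FIRST_TRY = re.compile(r"^(.+)-rev([a-fA-F0-9]+)\.(min\.)?(js|css)$")

# Regrouped tail: one capture holds the whole "[.min].ext" suffix.
REGROUPED = re.compile(r"^(.+)-rev([a-fA-F0-9]+)((\.min)?\.(js|css))$")

def first_try(path):
    m = FIRST_TRY.match(path)
    # Mirrors the substitution $1.$4
    return f"{m.group(1)}.{m.group(4)}" if m else path

def regrouped(path):
    m = REGROUPED.match(path)
    # Mirrors the substitution $1$3
    return m.group(1) + m.group(3) if m else path

print(first_try("app-rev1f.min.js"))   # app.js      (the .min is lost)
print(regrouped("app-rev1f.min.js"))   # app.min.js  (the .min is kept)
```

Both patterns match the same requests; the difference is only in what the captures make available to the substitution.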

EvertVd




msg:4416457
 10:29 pm on Feb 10, 2012 (gmt 0)

Thanks Lucy,
Yes, I think the biggest problem is the first part since I have no control over that and basically the filename can be anything.
the rev-number is hexadecimal, hence the a-f.
For now this works, so I guess I'll have to field test it.

lucy24




msg:4416495
 3:41 am on Feb 11, 2012 (gmt 0)

the rev-number is hexadecimal, hence the a-f

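You might be tempted to say \h, but in the PCRE flavour Apache uses that means horizontal whitespace, not a hex digit. The POSIX class [[:xdigit:]] is the shorthand for hex, but since it can never be anything but hexadecimal, you wouldn't gain much by it. And oops, yes, I did miss the a-f detail.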

(.+)-rev([a-fA-F0-9]+)((\.min)?\.(js|css|png|jpg))$ $1$3


So you're only dumping the "revNNN" part-- which could be js or css or an image-- and keeping everything else? You don't need parentheses around the [a-fA-F0-9]+ since it's just one piece. In RegEx terms, it's all one letter :) That saves you having to keep track of captures. If you need to keep the ".min", then combining your last two captures is a good way to do it.

Request for
blahblah-rev01F.css

rewrite to
blahblah.css

Request for
blahblah/blahblah/blahblah-rev01F3A.min.js

rewrite to
blahblah/blahblah/blahblah.min.js

Is that what you're aiming for?

Do any of your filenames contain a hyphen? Domain name doesn't matter, just the filenames. If not, you've got it made, because all you have to say is

^([^-]+)

and that will make the capture stop nicely before the -rev.
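A small sketch of how the ^([^-]+) prefix behaves, again using Python's re module as an assumed stand-in for Apache's engine. The full pattern is a guess at how the prefix would be combined with the rest of the rule (with the parentheses around the hex part dropped, as suggested above), and the hyphenated filename is a made-up counterexample.

```python
import re

# Assumed combination of the [^-]+ prefix with the rest of the rule.
HYPHEN_FREE = re.compile(r"^([^-]+)-rev[a-fA-F0-9]+((\.min)?\.(js|css|png|jpg))$")

def rewrite_hyphen_free(path):
    m = HYPHEN_FREE.match(path)
    # Mirrors a substitution of $1$2: prefix plus "[.min].ext".
    return m.group(1) + m.group(2) if m else path

# No hyphen before "-rev": the prefix capture stops exactly at the "-rev".
print(rewrite_hyphen_free("blahblah/blahblah/blahblah-rev01F3A.min.js"))

# A hyphen earlier in the path: the pattern no longer matches at all,
# so the request would pass through unrewritten.
print(rewrite_hyphen_free("my-file-rev01F.css"))
```

The second call is why the hyphen question matters: one hyphen anywhere in the directories or the filename is enough to stop the rule from firing.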

EvertVd




msg:4416526
 9:20 am on Feb 11, 2012 (gmt 0)

Yes, that is exactly what I am aiming for.

I suspect my filenames will not contain a hyphen, but just to be sure I could perhaps change -rev to ~rev, because I am absolutely sure no filename will ever contain a ~.
So that would give me
^([^~]+)
correct?

As for combining my last 2 captures, I do not follow. I was under the impression I already combined them?
Thanks again for your help, and yes it does help me understand better trying to figure it out myself. I always try that, but sometimes just someone to "talk" to while doing it can be a big help. I guess twitter would be better suited for that, but I couldn't find an expert on there ;-)

lucy24




msg:4416564
 11:02 am on Feb 11, 2012 (gmt 0)

Oh, yes, if it's in your power to change the "rev" part then that would be a good way to do it. And ~ has no special meaning in RegEx so you don't need to pay any extra attention to it.

Conversely ~ never shows up in filenames except sometimes in user directories. (My mental association is with big sprawling academic domains, but I guess it occurs in other places.) We have to remember that it isn't just the filename itself. It's all the directories you pass along the way. But if you had any directories with leading ~ you would know about it ;)

As for combining my last 2 captures, I do not follow. I was under the impression I already combined them?

Yes, you did, so I should have said "It WAS a good idea."
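For the ~rev variant, a comparable sketch (Python's re module as an assumed stand-in for Apache's engine; the complete pattern and the sample paths are illustrative guesses, not the rule actually deployed):

```python
import re

# Assumed final form: "~rev" as the marker, with a [^~]+ prefix.
TILDE = re.compile(r"^([^~]+)~rev[a-fA-F0-9]+((\.min)?\.(js|css|png|jpg))$")

def rewrite_tilde(path):
    m = TILDE.match(path)
    # Mirrors a substitution of $1$2: prefix plus "[.min].ext".
    return m.group(1) + m.group(2) if m else path

# Hypothetical sample requests.
for sample in ["my-file~rev01F.css",          # hyphens are now harmless
               "css/my-file~revA9.min.css",   # nested directory, minified
               "jquery-1.7.min.js",           # no "~rev", left alone
               "~user/site~rev1.css"]:        # leading ~ directory blocks the match
    print(sample, "->", rewrite_tilde(sample))
```

The last sample illustrates the caveat about user directories: a ~ anywhere before the marker means the [^~]+ prefix cannot reach it, so the request is not rewritten.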

EvertVd




msg:4416568
 11:11 am on Feb 11, 2012 (gmt 0)

Ok, I think I got it covered now.
Thanks again :-)

g1smd




msg:4416639
 6:28 pm on Feb 11, 2012 (gmt 0)

As every RewriteRule RegEx pattern is evaluated for every URL request, having a pattern that can be found to be a "match" or "not a match" as early as possible for non-matching requests is a good idea.

lucy24




msg:4416703
 10:20 pm on Feb 11, 2012 (gmt 0)

Do you think it would run faster if the Rule itself were constrained to extensions, and the "rev" part pushed into a Condition? Or possibly the other way around? Seems counter-intuitive to me. He's already got a conditionless Rule, which is generally the Gold Standard ;)

That's assuming the pattern really can occur in absolutely any directory. Otherwise there's more room for constraint.

g1smd




msg:4416705
 10:28 pm on Feb 11, 2012 (gmt 0)

It's fine as it is. If you pattern match for extensions, you still have to parse the entire URL left-to-right to get there. Doing more pattern matching in a separate RewriteCond will be even slower, but only for those matching requests.

EvertVd




msg:4416980
 2:33 pm on Feb 13, 2012 (gmt 0)

A final thought: Would it run faster or slower if I enclosed this rule inside a
<FilesMatch "\.(js|css|png|jpg)$">
condition?

lucy24




msg:4417152
 10:10 pm on Feb 13, 2012 (gmt 0)

I'll be darned. I didn't know you could use <Files... constructions for anything but core Allow / Deny directives. Tried it in The Other Site and did not get the expected 500 error, so I guess you can.

But since you're looking for the exact same thing either way, and you've already got a single conditionless Rule, I can't see how <Files... would speed things up. Besides, it scares the ### out of me ;)

g1smd




msg:4417155
 10:18 pm on Feb 13, 2012 (gmt 0)

[httpd.apache.org...]
and
[httpd.apache.org...]

EvertVd




msg:4417294
 9:09 am on Feb 14, 2012 (gmt 0)

LOL Lucy, I didn't mean to scare you ;-)
Thanks again, you too g1smd.
