Forum Moderators: phranque

Question about modification time on mod expires

         

sbraaa

10:56 am on Dec 5, 2011 (gmt 0)

10+ Year Member



I've enabled mod_expires to cache some resources in a web application I wrote (Apache + Linux/Win + PHP + PostgreSQL).
My problem is that access-time-based logic doesn't work well with my application. I need to be able to make modifications at any time, and I cannot refactor the entire framework to add a fingerprinting technique or use any rewrite logic. :-(
So I thought I could use modification-time-based logic: when the browser asks for a resource, a header is returned with the last modification time and a future expiration time. I could then set a modification-based expiration time far in the future and change the modification time of the resource (using touch, for example) to refresh the cached copy of that single resource.
My question is: if I modify the resource BEFORE the last-modified-based expiration time has passed, should the browser request the new version of the resource and refresh its cached copy?
Thank you all in advance

lucy24

10:25 pm on Dec 5, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



if I modify the resource BEFORE the last-modified-based expiration time has passed, should the browser request the new version of the resource and refresh its cached copy?

First answer: Wouldn't that kinda defeat the point of setting an expiration time at all?

Longer answer: Better come back and explain the word "should". Really. I'm not sure whether you mean "Is this the expected behavior from a compliant browser?" or "Is this the expected behavior from my current mod_expires settings?" or possibly even "How do I change my settings to bring about this behavior?" and/or "...to prevent this behavior?"

sbraaa

9:22 am on Dec 6, 2011 (gmt 0)

10+ Year Member



Sorry, I'll try to explain better!
Forget my current mod_expires settings! They're currently based on access time and they don't fit my needs.
My requirements are: cache some resources (images, JS files, etc.) and refresh them when they change, without knowing in advance when the changes will happen!
Access-time logic doesn't fit my needs because the names of my resources never change and I cannot introduce any fingerprinting technique (as I wrote before).

So what I really need is to find some settings (perhaps modification-time logic) that meet my needs.

I think modification time could be the right way, but I don't completely understand how modification-time logic works in mod_expires, so I cannot predict whether it is the right solution for my needs.

My final question is: if I manually change the modification time of a resource that was cached using modification-time logic and has not yet expired, does the browser 'see' the change (signalled by the header), or does it simply wait for the resource to expire? And is that the expected behaviour for every browser?

lucy24

1:19 pm on Dec 6, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



:: shuffling papers ::

Apache [httpd.apache.org] says ominously
When the Expires header is already part of the response generated by the server, for example when generated by a CGI script or proxied from an origin server, this module does not change or add an Expires or Cache-Control header.

I think this means you could beat your brains out for weeks and never get it to work the way you want, if there is other stuff going on that's beyond your control.

What do you mean by "manually change modification time"? Changing the variable part, like from 1 week to 2 days? Or re-uploading an unchanged file so it looks as if it has been changed?

As I understand it, the browser always looks at the header. So if someone visited yesterday and met a header that said "good for another five days", their browser will still check the header tomorrow; they won't just ignore it for those five days. And of course the user is always free to refresh manually-- and if they empty their cache, it doesn't matter if you just modified the file three minutes ago. It will get loaded up again. Conversely, if the ISP uses remote caching, they may or may not pay attention to what your header says.

Note that if you use a modification date based setting, the Expires header will not be added to content that does not come from a file on disk. This is due to the fact that there is no modification time for such content.

Well, you don't have any dynamically created images do you? ;) Modification sure seems like what you're looking for.

ExpiresByType image/gif "modification plus 5 hours 3 minutes"

(says Apache, but they're just showing off)

If cached, the document may be fetched from the cache rather than from the source until this time has passed. After that, the cache copy is considered "expired" and invalid, and a new copy must be obtained from the source.

Sorry, but I just love those "mays" and "musts". On one hand they're giving permission; on the other hand they're saying you gotta... or else. Uhm. Or else what? :)
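
To make the difference between the two bases concrete, here is a rough Python sketch of the arithmetic. It is only a model, not Apache's code: the helper names `expires_at` and `remaining` are made up for illustration, and clamping the remaining lifetime at zero is an assumption about how an already-past expiry is best read.

```python
from datetime import datetime, timedelta

def expires_at(base, request_time, mtime, interval):
    """Expiry time counted from the request ('access') or from the file's mtime ('modification')."""
    start = request_time if base == "access" else mtime
    return start + interval

def remaining(expires, request_time):
    """Seconds of freshness left at request time; clamped at zero in this model."""
    return max(0, int((expires - request_time).total_seconds()))

interval = timedelta(hours=5, minutes=3)   # "plus 5 hours 3 minutes"
mtime = datetime(2011, 12, 6, 9, 0, 0)     # file last changed at 09:00

early = datetime(2011, 12, 6, 10, 0, 0)    # requested one hour after the change
late = datetime(2011, 12, 7, 10, 0, 0)     # requested the next day

access_early = remaining(expires_at("access", early, mtime, interval), early)
access_late = remaining(expires_at("access", late, mtime, interval), late)
mod_early = remaining(expires_at("modification", early, mtime, interval), early)
mod_late = remaining(expires_at("modification", late, mtime, interval), late)

print(access_early, access_late, mod_early, mod_late)
```

With the access base every visitor gets the full interval no matter when they arrive. With the modification base the lifetime shrinks as the file ages, and once the file is older than the interval the response is already stale when it is served.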

sbraaa

2:08 pm on Dec 6, 2011 (gmt 0)

10+ Year Member



Thanks for your quick answer! :-)
To be honest, I think the mod_expires documentation is not very exhaustive! Anyway...

What I really need is the same result I would get using access-time-based logic with a proper fingerprinting technique on my resources. In that case, changing a resource (uploading a new version of the file) refreshes the browser cache, because fingerprinting changes the filename.
Unfortunately I can't do this, because my framework doesn't support any fingerprinting technique.
So even if I change a particular resource (e.g. a JS file) and upload it again, the browser cache will not be revalidated, because the name is the same. The user will not see any changes unless they force a browser cache refresh or the cached copy has expired in the meantime.

I was beating my brains out to find a valid solution to this problem, and I thought that modification-time logic could be the right way.

Suppose that I upload a JS file with this timestamp:
2011-12-06 09:00:00
and set the mod_expires rule for JS files to "modification plus 2 hours".
As we know, if I access a JS file from my webapp, my browser will keep it in cache until 2011-12-06 11:00:00, then it will discard the cached copy and download a fresh copy of the file.
Now suppose that at 10:00:00 I make some changes and upload that JS file again (obviously with the same name)... what happens then?

Will the cached copy of the file be discarded? I guess that could happen if the header says the last modification time is 10:00:00... am I wrong?
Is it possible, or should I stop drinking at night? ;-) :D
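
The timeline in this post can be worked through in a few lines of Python. This is only a sketch of a cache that follows the Expires header strictly; the names `expires_for` and `is_fresh` are invented for illustration and are not part of Apache or any browser, and real browsers may behave differently (manual reloads, emptied caches, proxies).

```python
from datetime import datetime, timedelta

LIFETIME = timedelta(hours=2)  # models the rule "modification plus 2 hours"

def expires_for(mtime):
    """Expiry a server would advertise for a file last modified at mtime."""
    return mtime + LIFETIME

def is_fresh(cached_expires, now):
    """A strict cache reuses its copy, without asking the server, until the expiry passes."""
    return now < cached_expires

upload_1 = datetime(2011, 12, 6, 9, 0, 0)    # first upload
upload_2 = datetime(2011, 12, 6, 10, 0, 0)   # changed file uploaded under the same name

# The browser fetched the file some time before 10:00 and stored this expiry.
cached_expires = expires_for(upload_1)

# At 10:30 the server already has the new file, but the cache has no reason to ask.
fresh_at_1030 = is_fresh(cached_expires, datetime(2011, 12, 6, 10, 30, 0))

# Just after 11:00 the copy is stale; the next request would pick up the new expiry.
fresh_at_1101 = is_fresh(cached_expires, datetime(2011, 12, 6, 11, 1, 0))
new_expires = expires_for(upload_2)

print(cached_expires, fresh_at_1030, fresh_at_1101, new_expires)
```

Under these assumptions the 10:00 upload goes unnoticed until 11:00, because the new Last-Modified value only reaches a browser that actually sends a request.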

lucy24

12:12 am on Dec 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I think it would be useful to try it in your individual setup. Trial-and-error is not always safe, but this is not a situation where getting it wrong will result in a 500 error and potential server meltdown ;)

lucy24

11:21 pm on Dec 7, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Returning...

I went off and read up on caching because unlike most Apache issues, it doesn't all take place in the server. This in turn means that some of it is out of your control and you can only hope that people will do what you want. And that's part of the problem right there: "people" means two entirely different things. There's the individual browser ("private" caching) which will probably do what you want, and assorted ISPs and other proxies ("public" caching) which may or may not do what either you or the end user wants.

For starters I had to confirm one hunch, because making things up as you go along can only take you so far. Cached files that have passed their expiration date do not disappear. Look in the back of your refrigerator for the obvious analogy ;)
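
What normally happens to an expired copy is revalidation: the browser asks the server whether the file has changed since the Last-Modified time it stored (an If-Modified-Since request), and the server either answers 304 Not Modified or sends the new file with 200. A minimal sketch of that decision, assuming a server that compares timestamps only; the function name `revalidate` is hypothetical:

```python
from datetime import datetime

def revalidate(if_modified_since, file_mtime):
    """Status for a conditional GET: 304 if the file is unchanged, 200 if it is newer."""
    if file_mtime <= if_modified_since:
        return 304  # cached copy is still good; no body is sent
    return 200      # file changed; the full new version is sent

stored_last_modified = datetime(2011, 12, 6, 9, 0, 0)

# File untouched since the browser cached it
unchanged = revalidate(stored_last_modified, datetime(2011, 12, 6, 9, 0, 0))

# File re-uploaded (or touched) at 10:00
changed = revalidate(stored_last_modified, datetime(2011, 12, 6, 10, 0, 0))

print(unchanged, changed)
```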

w3c goes on at great length about the assorted warnings that browsers are supposed to give. They're even more incomprehensible than Apache docs:

[w3.org...]

That's from 1999 but I think the general principles still apply. I found a second source that's in baby talk by comparison:

[mnot.net...]

(Sorry, mods, don't know if that's an Authorized Link. Possibly not, since I understood it.)

From all this I gather that my browsers-- possibly even most browsers-- tend to do the exact opposite of what browsers are supposed to do. In particular, if you request something from your browser history, it's supposed to pull the files out of your cache like your own little Wayback Machine, not go looking for the site all over again. News to me.

sbraaa

9:40 am on Dec 9, 2011 (gmt 0)

10+ Year Member



First of all, thank you for your patience ;-)
I already knew the links you posted; unfortunately they're all generic, so I think the right way is to try it myself ;-)
Thank you

g1smd

9:45 am on Dec 9, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Users can set the size of their browser cache and older data will be purged automatically.

If you tell the browser not to look for a new copy of the page for 2 weeks, the browser will not look for 2 weeks unless the local copy has already been discarded.

lucy24

2:20 am on Dec 10, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Be sure to post back if you learn something useful. Or even if you learn something useless, so others don't waste time trying things that don't work.

I've got a nasty feeling that remote caching-- what the docs call "public" caching-- will be an insoluble problem. That's based on the behavior of one local IP-- not mine, luckily-- that goes around caching things that should never, ever be cached. Not just "no-cache" or "expires yesterday" types of things; it was caching site logons with all the user information filled in, and feeding it to the next person who went to the same site. Not Nice.