Forum Moderators: phranque


Forcing browser cache update

apache, .htaccess


smallcompany

7:23 pm on Jan 15, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Based on findings from Google's Page Speed tool, I implemented the following to resolve one of the flagged issues:

### CACHE
# Set up Expires and Cache-Control headers
ExpiresActive On
# Cache images for up to six months -- no forced revalidation
<FilesMatch "\.(gif|jpe?g|png|bmp|ico)$">
ExpiresDefault A15552000
</FilesMatch>
# Cache CSS, HTML, and JS for up to 45 days -- no forced revalidation
<FilesMatch "\.(css|html|js)$">
ExpiresDefault A3888000
</FilesMatch>
#

Now, if I make changes that I want to be sure are reflected on all visitors' computers, what change to this code would force them all to update their caches?

Thanks

jdMorgan

12:41 am on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's too late for that. The soonest that all caches can be updated for image files is in six months. You've told the browsers not to check for updates for six months, and many of them won't do it. So the only way you can force an update of any given content is to change the URL of that content by changing your links to it (if it's a page) or changing the object include reference, e.g. the <img src=> tags.
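One common way to make that URL change painless is to embed a version number in the filenames you reference and strip it with mod_rewrite, so bumping the number in your HTML changes the URL without touching the file on disk. This is a sketch, not something from this thread -- the pattern and filenames are illustrative, and it requires mod_rewrite:

```apache
# Hypothetical cache-busting rewrite. Reference "/css/style.1002.css"
# in your HTML, and bump the number whenever the file changes.
RewriteEngine On
# Serve style.1002.css (any numeric version) from style.css on disk
RewriteRule ^(.+)\.\d+\.(css|js|png|gif|jpe?g)$ $1.$2 [L]
```

Browsers treat each versioned name as a brand-new URL, so the old cached copy is simply never requested again.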

There is little benefit to caching *anything* much beyond a few weeks, and a serious down-side -- as illustrated here.

I'd suggest you cache your pages for three days maximum, images for ten days to two weeks, and then if you plan to make any changes, reduce these cache times *before* you start working on the project. Basically, you must plan ahead for the period of time that you've told the clients, "keep what you've cached, and don't come back to my server."
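Jim's suggested times translate to something like the following (the second counts are my arithmetic: 3 days = 259200 s, 10 days = 864000 s):

```apache
ExpiresActive On
# Images: ten days
<FilesMatch "\.(gif|jpe?g|png|bmp|ico)$">
ExpiresDefault A864000
</FilesMatch>
# Pages, CSS, JS: three days
<FilesMatch "\.(css|html|js)$">
ExpiresDefault A259200
</FilesMatch>
```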

Jim

smallcompany

1:58 am on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks. So once you tell the browser to cache for 45 days, there's no way to force it to update?

What about:

Header set Cache-Control "no-cache, must-revalidate"

How does that revalidation work?

Is there any way to force an update for browsers that have already picked up the long cache settings?

P.S.
The hint to set longer cache times came from Google's Page Speed tool.
I feel so stupid for worrying more about what Google will say (be that organic or PPC, like Quality Score) than about actual visitors.

jdMorgan

3:57 am on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



> I feel so stupid for worrying more about what Google says ... than about actual visitors.

Well, don't feel stupid. Caching isn't an easy concept to get used to, even with clear and simple examples...

It's a very good thing to send Expires and Cache-Control headers, but you need to be sure you know their effects, and that you set "reasonable" cache-expires times... And the definition of "reasonable" depends very much on your specific site, your pages, your visitors' typical behavior, etc.

Start with shorter caching times, and increase them only after your site is mature, you no longer need to make frequent or 'big' changes, and you are used to the "planning ahead" that cache management requires.

I'd say go a week at the maximum right now. You can easily plan ahead that far, and if you do have to wait out a full week it may be a little painful, but it won't be fatal. Having to wait a month, however, is usually too long on today's fast-changing Web. And again, the payoff for a little caching is huge, while the further gains beyond that get smaller and smaller.

Let's say 90% of your visitors every day are new, while ten percent are regular visitors, and that all visitors typically load an average of three pages with seven 'common' images, one 'common' stylesheet, and two 'common' JavaScript files during one session over the course of ten minutes to one hour, and that the regular visitors re-visit every day.

If you cache for one hour, you may cut your server load to half (50%) of the original. Cache for two hours, and you'll cut it to one-third (33%). Cache for four hours, and you'll cut it to one quarter (25%). Cache for eight hours, and you'll cut it to one-fifth (20%)...

Do you see where this is going? Each doubling of cache time *does not* reduce the server load as much as the previous doubling. That's why there's little use to using cache times longer than a few weeks... doubling that to one month brings almost *zero* gain.
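The pattern in those illustrative numbers can be made explicit (this is just a curve matching Jim's loose example, not a general law): for a cache time of T hours,

```latex
% Load fraction for T = 1, 2, 4, 8 hours -> 1/2, 1/3, 1/4, 1/5
\text{load}(T) \approx \frac{1}{\log_2 T + 2}
```

Each doubling of T adds only 1 to the denominator, so every doubling saves less than the one before it.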

The above is just a loose example: The effectiveness of caching depends heavily on your user behavior, and the mix of 'new visitors' to daily or even hourly visitors. It also depends on how fast your objects change, and how much those objects are 'shared' among several pages. So on some sites, caching for a week makes sense, while on others, caching for any more than a few hours gives very little additional benefit.

> How does that revalidation work?

If you set 'Cache-Control: "no-cache, must-revalidate"' and a short Expires time, then the browser will check with the server every time it wants to re-display a cached object, using an "If-Modified-Since" request header to send back the timestamp that your server originally sent with that cached object in its "Last-Modified" response header. If the object has been updated since the timestamp sent by the client, then the server will send the new object and a new Last-Modified timestamp header. If not, it will respond with only a "304 Not Modified" response header.

So, the advantage of revalidation is that you still save some bandwidth with little risk of the client showing "stale" objects, but the disadvantage is that the client must wait while your server checks the client's If-Modified-Since header against the file's "Last-Modified" timestamp and of course, the server has to actually go check the filesystem to get that "Last-Modified" time. So all that's saved is the actual content transfer bandwidth and transfer time.
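Concretely, a revalidation round trip looks something like this (the header values are invented for illustration):

```
GET /style.css HTTP/1.1
Host: www.example.com
If-Modified-Since: Fri, 15 Jan 2010 19:23:00 GMT

HTTP/1.1 304 Not Modified
Date: Sat, 16 Jan 2010 03:57:00 GMT
```

The 304 response carries no body, which is where the bandwidth saving comes from.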

---

However, changing your cache-control headers now won't affect clients which have already cached the content, so for them it's still either change the URLs or wait out the cache times you originally sent... :(

---

Here's what works for me on one of my older 'standard' sites:

  • I Expire most images, video, Excel, Word, and PDF documents after two weeks, with no Cache-Control header at all -- just "ExpiresDefault A1296000"
  • I Expire CSS and JavaScript files after eight hours. On 'regional' sites I update them only after midnight; on 'global' sites I reduce the expiry time the day before I want to make a change, reducing it progressively further during the day while I'm working on the change.
  • For very-frequently-updated pages, I Expire after one hour, and use 'Cache-Control: "no-cache, must-revalidate"'
  • I mark all ErrorDocuments and "tracking images" as uncacheable with 'Cache-Control: "no-store"'
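Translated into .htaccess terms, that scheme might look like the following sketch (the file extensions, the tracking-image filename, and the exact second counts are my assumptions, not Jim's; the Header lines require mod_headers):

```apache
ExpiresActive On
# Images and office/PDF documents: ~two weeks (1296000 s = 15 days)
<FilesMatch "\.(gif|jpe?g|png|pdf|doc|xls)$">
ExpiresDefault A1296000
</FilesMatch>
# CSS and JavaScript: eight hours
<FilesMatch "\.(css|js)$">
ExpiresDefault A28800
</FilesMatch>
# Frequently-updated pages: one hour, and always revalidate
<FilesMatch "\.html$">
ExpiresDefault A3600
Header set Cache-Control "no-cache, must-revalidate"
</FilesMatch>
# Tracking image (hypothetical filename): never stored at all
<Files "tracker.gif">
Header set Cache-Control "no-store"
</Files>
```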

That's what works for me. It might work for you, or it might be totally unsuitable -- everything depends on the nature of your site and its visitors...

Jim

smallcompany

6:03 pm on Jan 16, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Thanks very much for your thorough reply, Jim.

I can't help but comment on two things:

1. I can't understand why there is no "rescue command" that would force browsers to update before the expiration time, and that only a URL change will do it.
2. I tested Google's Page Speed again yesterday and found it kept warning me about "Leverage browser caching" until I set the cache time to 45 days.

Obviously nothing can be taken exactly as suggested -- only after extensive reading.

At least now I know that I can combine Header set Cache-Control "no-cache, must-revalidate" with a short expiry, and update the file's timestamp when I want clients to pick up a change. That helps.

Thanks