Making sure .htaccess is ok

Configuring Expires and Cache-Control response headers

         

sftriman

6:26 pm on Dec 16, 2009 (gmt 0)

10+ Year Member



OK, I got Apache 2 installed, enabled ExpiresActive, and
got everything set up. In my ~/public_html directory in
the .htaccess file, I added:

<IfModule mod_headers.c>
<FilesMatch "\.(ico|gif|jpg|png)$">
Header set Cache-Control "max-age=2592000"
</FilesMatch>
<FilesMatch "\.(js|css)$">
Header set Cache-Control "public,max-age=2592000"
</FilesMatch>
# <FilesMatch "\.(html|htm|txt)$">
# Header set Cache-Control "max-age=2700"
# </FilesMatch>
</IfModule>

I then use Page Speed from Google for Firefox / Firebug and
see that it's working - I think. I also used the Chrome
add-on HTTP Headers to view the headers. In this case, it
appears every time that the images are fetched and a 304
return status is issued.

My site is <snip>
Does this all look ok?

David

[edited by: jdMorgan at 6:30 pm (utc) on Dec. 16, 2009]
[edit reason] No URLs, please. See TOS and Charter. [/edit]

jdMorgan

6:39 pm on Dec 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not sure why you'd want to re-validate images but not pages, and that behavior does not seem to correspond with your cache-control settings... (?)

You should take advantage of regular expressions, and shorten your third <FilesMatch> to

 # <FilesMatch "\.(html?|txt)$">

You don't strictly need to specify the max-age in the Cache-Control headers, as mod_expires will insert it if you use directives like

ExpiresDefault A2700

within your <FilesMatch> sections.
This also explicitly adds the HTTP "Expires:" header to your server response.
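
As a rough sketch of that suggestion (assuming mod_expires is loaded), the commented-out third section could rely on ExpiresDefault alone. The 2700-second lifetime is the one from the original post; the indentation and comments are mine:

```apache
# Sketch only: requires mod_expires. ExpiresActive must be On for any
# Expires* directive to take effect.
ExpiresActive On

<FilesMatch "\.(html?|txt)$">
    # "A2700" means 2700 seconds (45 minutes) after the client's access time.
    # mod_expires derives both the Expires header and a Cache-Control
    # max-age value from this setting.
    ExpiresDefault A2700
</FilesMatch>
```

Checking the response headers for an .html request afterwards should confirm whether both headers are actually being sent on your setup.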

Consider that 30 days may be too long for .css and .js caching -- What if you need to update those files? The work-around is to rename them, but that requires editing your pages to change the include links...
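
One possible way to soften the rename work-around, sketched here under the assumption that mod_rewrite is available and allowed in .htaccess: put a version number in the URL only, and map it back to the unversioned file on disk. The ".v<number>" naming scheme is hypothetical and does not come from this thread. You still edit the links in your pages, but the files themselves keep their names:

```apache
# Sketch only: requires mod_rewrite and AllowOverride FileInfo.
RewriteEngine On

# A request for css/site.v42.css is served from css/site.css.
# Changing the number in the page's link gives the browser a new URL,
# so the long-lived cached copy of the old URL is simply never used again.
RewriteRule ^(.+)\.v[0-9]+\.(css|js)$ $1.$2 [L]
```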

I can't comment further without knowing your intended caching policies for each filetype.

Jim

sftriman

6:42 pm on Dec 16, 2009 (gmt 0)

10+ Year Member



Correction: it's the HTTP Headers add-on for Firefox, not Chrome.
For Chrome, I was using Speed Tracer.

The problem I see in HTTP Headers, as I interpret it, is that
for a given image, there are 2 parts - the first is like a "before"
HTTP get section, and the next is the HTTP request section.
The before part says:

Cache-Control: max-age=0

and the next part says:

Cache-Control: max-age=2592000

and shows the 304 return code.

When I reload the page (not shift-refresh), it says the same thing.
But I would think the "before" part would have max-age=2592000
and I wouldn't see a 304 in my server logs; that is, the image would
be known to be cached in the browser and it wouldn't try to get it.

You can see this on my site using the same tools.

Any input would be appreciated!

David

sftriman

6:54 pm on Dec 16, 2009 (gmt 0)

10+ Year Member



Thanks for the reply, Jim. The third section is commented out -
I don't intend to cache any html or txt files. It's just images
and CSS and JS. I agree on the CSS and JS, and I had those at
one week initially before posting here. With that in mind, I just
put in explicit expire A times on each of the two sections.

For the moment, with the code as is, my two main questions are:

* why do I still see 304 return codes in the headers? (and of
course, see those requests in my access log)
* I got conflicting information on using public and private for
the CSS and JS files (or any other files for that matter);
Google Page Speed says to use private, but almost every example
of Cache-Control I've read searching on Google says to use public.
Which is better?

Thanks!
David

jdMorgan

10:00 pm on Dec 16, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The 'before' part in the headers report is the request from Firefox to your server. The 'after' part is the response from your server.

Unless you're dealing with pages that change on every request, secure pages used for logging into accounts, or ErrorDocuments, then basically everything else should be cacheable. Images and media files the longest, because they're least likely to change, .css and .js in the middle range (not likely to change frequently on a mature site), and pages should be cached for shorter times -- a day or two max, in most cases.

"Public" means to allow caching in public (network) caches, like those used by AOL and Earthlink at the 'borders' of their networks. "Private" means browser and other end-client caches. If you change your content depending on the requesting user agent (e.g. IE versus everyone else), then the benefit of public caching is reduced. But you can still make use of it if you set a proper "Vary: User-agent" header.
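
A minimal sketch of the public-caching case with a Vary header, assuming mod_headers is available. The one-week lifetime (604800 seconds) is only an illustrative value:

```apache
# Sketch only: requires mod_headers.
<FilesMatch "\.(js|css)$">
    # Allow shared (network) caches as well as browsers to keep a copy.
    Header set Cache-Control "public, max-age=604800"
    # Tell shared caches that the response may differ per User-Agent,
    # so they store a separate copy for each distinct value.
    Header append Vary User-Agent
</FilesMatch>
```

Swapping "public" for "private" in the same section would restrict caching to the end client.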

Again, I recommend that you use the mod_expires ExpiresDefault directive to set max-age in the Cache-control header. For stuff you want kept current, set 'Cache-control: "no-cache, must-revalidate"'

Here's a working example:


# Set up Cache Control and Expires headers
ExpiresActive On
#
# Default - Set Cache-Control header to expire everything 1 day from last access, set must-revalidate
ExpiresDefault A86400
Header set Cache-Control: "must-revalidate"
#
# Images
<FilesMatch "\.(gif|jpe?g|png|ico)$">
ExpiresDefault A1209600
</FilesMatch>
#
# Web beacon image
<FilesMatch "beacon\.gif$">
ExpiresDefault A0
Header set Cache-Control: "no-store"
</FilesMatch>
#
# Uncacheable files
<FilesMatch "(helpsearch|localeinfo|legal_info|test|401|403[a-z]?|404|410|500)\.html$">
ExpiresDefault A0
ErrorHeader set Cache-Control: "no-store"
</FilesMatch>
#
# Very-frequently-updated files
<FilesMatch "^(a-results|b-results)\.html$">
ExpiresDefault A3600
Header set Cache-Control: "no-cache, must-revalidate"
</FilesMatch>
#
# Admin pages (password-protected)
<FilesMatch "^admin_sched\.html$">
Header set Cache-Control: "private, must-revalidate"
</FilesMatch>
#
# Frequently-updated pages
<FilesMatch "^(index|tech_sched|calendar|cal[0-9]{4})\.html$">
ExpiresDefault A7200
Header set Cache-Control: "no-cache, must-revalidate"
</FilesMatch>

Note that "no-cache" doesn't do what it sounds like it does; in order to prevent caching, use "no-store." In fact, as far as I can tell, "no-cache" simply tells the client, "I'm serious when I say 'must-revalidate' after this."

Also note that later directives override earlier ones if they have the same or wider scope, and the design shown above relies on this to keep the code short.

Posting on this forum modifies the pipe character, so it may show up as a "¦" broken pipe. If you use any of this code, be sure that every alternation separator in the patterns is a solid pipe character...

Jim

sftriman

10:32 pm on Dec 16, 2009 (gmt 0)

10+ Year Member



The part I don't get is, if I told the requesting browser,
"Here's an image, and you can keep it in your cache for 30 days",
then when the browser reloads the page, why does it do the
must-validate thing and cause a 304 on my server? To really
reduce calls to my server, I'd like the browser to say, "Oh,
you want that image? Well, I've been told to not even bother
looking for it or validating it for 30 days, so I'll just
show it."

I can see the gotcha in that, if the browser didn't validate
the image and get the 304, and that image for whatever reason
was deleted on my server in the meantime since it was told to
be cached, then either a broken image results on the page, or
worse, maybe the browser, expecting an image, crashes or otherwise
renders a messed up page.

So maybe the answer here is that even with Cache-Control, browsers
are still going to validate the image and get the 304 code - that
step isn't eliminated. I'd like to think it's possible to get
rid of it, though.

Thanks for the clarification on private and public. I'd just as
soon make all my settings be private then, if it means that cache
occurs on the browser level.

David

jdMorgan

4:21 am on Dec 17, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I believe the answer is that the browser is re-fetching because you are not sending the Expires header. I do, and my images are never re-fetched unless the expiry time is exceeded, the user deletes his cache or forces a complete page reload, or if the cached copies of my images are over-written in his cache by newer cached objects from other sites -- which can happen if the user has "used up" his entire allocated cache space.

The working-example code above (which basically came straight off one of my servers except for being 'anonymized' for posting here), suitably modified for your site's filenames, should stop your problem.

Jim

sftriman

3:48 pm on Dec 17, 2009 (gmt 0)

10+ Year Member



Jim,

I tried an example very similar to what you have running.
I have a simple test page on which I keep creating new
copies of an image to test with. I'm still getting 304
messages. Using Live HTTP Headers on Firefox, I am seeing
that my image is marked with the Cache-Control must-revalidate
for the image I'm working with, whence it's causing a 304
to be generated.

This is what I have, with a 5 minute timeout for images so
that I could test 200, 304 and hopefully no logging.

<IfModule mod_expires.c>
ExpiresActive On
# Default - everything 1 day from last access, must-revalidate
ExpiresDefault A86400
Header set Cache-Control: "must-revalidate"
# Images, 5 minutes
<FilesMatch "\.(jpe?g|gif|png|ico)$">
ExpiresDefault A300
</FilesMatch>
</IfModule>

I believe this is working, based on the headers I see.
The keep-alive is 5,99, and I see must-revalidate showing
up where I didn't see it before. Also, I removed the
Cache-Control directives in the .htaccess file in my
docroot, so the code above is the only code in play now.
For that reason, I don't see any Cache-Control info in
the headers for the image.

<snip>

Thanks.

David

[edited by: jdMorgan at 3:58 pm (utc) on Dec. 17, 2009]
[edit reason] No URLs, please. See Terms of Service and Forum Charter. [/edit]

jdMorgan

4:17 pm on Dec 17, 2009 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



As in my example above (with subsequent explanation), you must set "no-cache, must-revalidate" if you want to see 304s due to revalidation, or you should set only the expiry time (Expires header) if you don't want revalidation.

With the <FilesMatch> container order and scope that you have specified, the images are first marked with "must-revalidate" by the unconditional section at the top, and then their expiry time is set by the next section. However, the 'must-revalidate' header is not unset in that image section, so it's still going to be sent with your images.

Directive order is important. Container scope is important. Later code overrides earlier code if both sections have the same scope. Configurations set by earlier code but not modified, countermanded, or replaced by later code remain in force.

I suggest that you explicitly define the filetypes for which revalidation *is* desired, rather than trying to control when it is not. This is the approach taken in my example code.
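
A rough sketch of that opt-in approach, assuming mod_expires and mod_headers are both loaded. The lifetimes are borrowed from the earlier example in this thread, and the .html pattern is only a placeholder for whatever files you want revalidated:

```apache
# Sketch only: nothing is marked for revalidation by default.
ExpiresActive On
ExpiresDefault A86400

# Images: expiry time only, no Cache-Control set here, so no
# revalidation is requested for them.
<FilesMatch "\.(jpe?g|gif|png|ico)$">
    ExpiresDefault A1209600
</FilesMatch>

# Pages: the one place where revalidation is explicitly asked for.
<FilesMatch "\.html?$">
    ExpiresDefault A7200
    Header set Cache-Control "no-cache, must-revalidate"
</FilesMatch>
```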

If, however, you wish to continue with your current construct, then you'll need to unset the Cache-Control header within the images <FilesMatch> container:


Header unset Cache-Control
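
Placed in context, the construct might look like the sketch below. It reuses the 5-minute test values from the previous post and drops the <IfModule> wrapper. Whether a max-age value from mod_expires survives the unset depends on the order in which the two modules process the response, so treat this as something to verify in the response headers, not as a guaranteed result:

```apache
# Sketch only: requires mod_expires and mod_headers.
ExpiresActive On

# Default: 1 day from access, must-revalidate on everything.
ExpiresDefault A86400
Header set Cache-Control "must-revalidate"

# Images: 5 minutes, with the inherited must-revalidate removed.
<FilesMatch "\.(jpe?g|gif|png|ico)$">
    ExpiresDefault A300
    Header unset Cache-Control
</FilesMatch>
```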

Also note that <IfModule> containers should only be used if you want the contained code to fail silently if the module is not available. This is rarely what is desired if you control your own server(s), and I suggest that you remove or comment-out all <IfModule> containers in all current and future configuration code if this is the case.

Jim

sftriman

4:30 pm on Dec 17, 2009 (gmt 0)

10+ Year Member



I totally understand order of listings being important and scope.
In your example as listed in your earlier posting, I could see
that being the case, that the must-revalidate would carry on
through all the subsequent directives. But I figured there was
some reason your example was that way, and I was wondering why
there wasn't a Cache-Control define or redefine. Everything you
say makes perfect sense. I'll try it out now. Thanks for all
your help!

David