cache control in htaccess

correct syntax for cache control headers

         

gsdv

2:56 am on Jan 3, 2010 (gmt 0)

10+ Year Member



looking for help with the correct syntax for cache control, and with how to properly separate (and not cache) the https pages in a php cart...

# Set up Expires and cache control headers
ExpiresActive On
#
<FilesMatch "(401¦403¦404¦410¦500)\.html¦htm$">
ExpiresDefault A0
ErrorHeader set cache-control: "no-store"
</FilesMatch>
#
<FilesMatch "\.(xml¦txt)$">
ExpiresDefault A28800
Header set cache-control: "no-cache, public, must-revalidate"
</FilesMatch>
#
<FilesMatch "\.(ico¦pdf¦flv¦jpg¦jpeg¦png¦gif¦js¦css¦swf)$">
ExpiresDefault A604800
Header set cache-control: "no-cache, public, must-revalidate"
</FilesMatch>
#

jdMorgan

8:31 pm on Jan 3, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You'll need separate cache-control code in the .htaccess file(s) in your HTTPS directory structure (as usually implemented, HTTP and HTTPS files should be separated at least by directory).

Without knowing the details of your set-up, all I can do is recommend a couple of tweaks to your patterns. Use the power of regular expressions to avoid testing "html" and "htm" separately, and to fix the fatal flaw in the logical structure of the first pattern: as written, the top-level alternation makes "htm$" a complete alternative on its own, so any file whose name ends in "htm" was matched regardless of filename and would have been marked as non-cacheable. Also put the filenames and filetypes in order from most-likely-to-be-requested to least-likely-to-be-requested, for efficiency:


<FilesMatch "(401¦403¦404¦410¦500)\.html¦htm$">
--> <FilesMatch "(401¦403¦404¦410¦500)\.(html¦htm)$">
--> <FilesMatch "(404¦403¦401¦410¦500)\.html?$">
#
# ...
#
<FilesMatch "\.(ico¦pdf¦flv¦jpg¦jpeg¦png¦gif¦js¦css¦swf)$">
--> <FilesMatch "\.(gif¦jpe?g¦png¦css¦js¦ico¦pdf¦swf¦flv)$">

As always, replace all broken pipe "¦" characters you see in this forum with solid pipes before use; posting on this forum modifies the pipe characters.

The suggested order of filenames/filetypes is just a guess based on a typical site's request profile; you may want to make further adjustments based on a quick check of your site's stats.
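
As a rough sketch, this is how the corrected containers might look once the solid pipes are restored. The `Header always set` form is an assumption for Apache 2.2 or later, where it is the documented way to attach a header to error responses; it does not come from the posts in this thread. The Cache-Control line for the static files is simply left out of the sketch.

```apache
# Sketch only; patterns as suggested above, with solid pipes restored.
ExpiresActive On

# Custom error documents such as 404.html or 403.htm.
# "\.html?$" keeps the dot and the optional "l" in one alternative,
# so only the listed error pages match, not every file ending in "htm".
<FilesMatch "(404|403|401|410|500)\.html?$">
ExpiresDefault A0
# Assumption: Apache 2.2+ mod_headers.
Header always set Cache-Control "no-store"
</FilesMatch>

# Static files, most frequently requested types first.
<FilesMatch "\.(gif|jpe?g|png|css|js|ico|pdf|swf|flv)$">
ExpiresDefault A604800
</FilesMatch>
```

Because the first pattern is not anchored at the start, a file named x404.html would also match; prefixing the group with ^ is one way to tighten it if that matters.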

Jim

gsdv

5:47 am on Jan 4, 2010 (gmt 0)

10+ Year Member



hi jim,

thank you for the reply. the willingness to help others on this site is phenomenal. the inquiries are for a zen cart store setup on an apache server.

"Without knowing the details of your set-up..."

# Set up Expires and cache control headers
ExpiresActive On
#
<FilesMatch "(404¦403¦401¦410¦500)\.html?$">
ExpiresDefault A0
ErrorHeader set cache-control: "no-store"
</FilesMatch>
#
<FilesMatch "\.(xml¦txt)$">
ExpiresDefault A28800
Header set cache-control: "no-cache, public, must-revalidate"
</FilesMatch>
#
<FilesMatch "\.(gif¦jpe?g¦png¦css¦js¦ico¦pdf¦swf¦flv)$">
ExpiresDefault A604800
Header set cache-control: "no-cache, public, must-revalidate"
</FilesMatch>
#

thank you.

Swanny007

5:58 am on Jan 4, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



As an aside, can this be used to set cache-control on specific filenames, or is it just for filetypes (extensions)?

gsdv

6:23 am on Jan 4, 2010 (gmt 0)

10+ Year Member



i think this would be correct for adding filenames
<FilesMatch "(filename¦filename¦filename)\.html$">

jdMorgan

7:01 am on Jan 4, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



gsdv,

If you want separate cache-control for SSL and non-SSL resources, then *some* method of telling them apart is required. This can be by filename or filetype if you are limited to just .htaccess in one single directory-path, or it can be based on two separate .htaccess files, one in the HTTP-accessible directory and one in the HTTPS-accessible directory. In the latter case the directory is implicitly part of the condition under which the cache-control headers are constructed (because if the other directory is accessed, then the .htaccess code in this directory won't even run). Other than implicit .htaccess-location-based qualification and the <Files> and <FilesMatch> containers, there are no other useful 'conditional' controls available at the .htaccess privilege level that can be used to qualify mod_headers and mod_expires directives for your current application.

If you have access to the server configuration, then you have a few more options.

The final option is to 'wrap' the entire site in a script, which can output the correct Cache-Control, Expires, Last-Modified, Content-Type, and other HTTP response headers, and then "include" the requested resource (e.g. the requested "file") and send its contents to the requesting client. This can be viewed as using a script to replace almost the entire Apache content-handling phase.

I don't know how your files are disposed in regard to separate SSL/non-SSL directories or what privilege levels you have, and so cannot provide any other suggestions.
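
A minimal sketch of the two-file arrangement described above, assuming a hypothetical layout in which the cart's SSL pages live under their own directory. The file locations named in the comments, the file types chosen, and the use of `no-store` are illustrative assumptions, not details from this thread.

```apache
# Hypothetical file: .htaccess at the top of the HTTPS-only directory.
# It is only read for requests that map into this directory, so the
# location itself acts as the "condition".
<IfModule mod_headers.c>
<FilesMatch "\.(php|html?)$">
# Ask browsers and proxies not to store these responses.
Header set Cache-Control "no-store"
</FilesMatch>
</IfModule>

# Hypothetical file: .htaccess at the top of the HTTP directory.
# Static files here can carry a long lifetime instead.
<IfModule mod_expires.c>
ExpiresActive On
<FilesMatch "\.(gif|jpe?g|png|css|js|ico)$">
ExpiresDefault A604800
</FilesMatch>
</IfModule>
```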

---

swanny,

You can use any part of the filename with <FilesMatch>, including the name, file-extension, or both, but do be aware that <Files> and <FilesMatch> examine the filename only: They don't look at the URL, and they don't consider the directory path. They only consider the name of the file as it would appear in a directory-listing. So if you have two files which match the <FilesMatch> pattern, and these two files are in different directories, but both are in or below the directory in which the .htaccess file with the <FilesMatch> directive is located, then *both* files will be affected. If that's not desirable, then the code can be moved into .htaccess files in each of those two files' own directories, and different cache-control settings can then be applied separately to each of them.

Simply put, part of the scope of .htaccess code execution is based on the directives in that .htaccess file, and part of it is based on the location of that .htaccess file.
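
To illustrate the scope rule, here is a sketch with two hypothetical .htaccess files. The directory names and `chart.png` are made up for the example, and the override behaviour noted in the second comment is what Apache's normal per-directory merging should produce, so treat it as something to verify on your own server.

```apache
# Hypothetical file: /images/.htaccess
ExpiresActive On
# Matches chart.png in /images/ and in every directory below it,
# because only the bare filename is tested.
<Files "chart.png">
ExpiresDefault A604800
</Files>

# Hypothetical file: /images/live/.htaccess
# Read only for requests under /images/live/ and merged after the
# parent file, so this shorter lifetime should take precedence there.
<Files "chart.png">
ExpiresDefault A3600
</Files>
```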

Jim

gsdv

6:35 pm on Jan 4, 2010 (gmt 0)

10+ Year Member



hi jim,

is there any problem with just adding this cache control (below), then waiting to see if it helps or hurts, and then adding more cache control directives in a modular fashion?
#
<FilesMatch "\.(gif¦jpe?g¦png¦css¦js¦ico¦pdf¦swf¦flv)$">
ExpiresDefault A604800
Header set cache-control: "no-cache, public, must-revalidate"
</FilesMatch>
#
also what would be the equivalent syntax for adding the cache control above to the html headers instead of in the htaccess? for example using meta tags, or ?...

thank you

jdMorgan

4:17 pm on Jan 5, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Cache-control headers in the <head> section of HTML files are largely a waste of time, since browsers don't need them and network caches do not look at the content-body of HTTP responses, and so will never even 'see' the HTML.

Also, consider whether you really need to revalidate those media filetypes. Doing so means that the browser will send an If-Modified-Since request to your server every time, and that (in most cases) your server will (have to) respond with a 304 Not Modified status response. It's usually better to simply mark those filetypes with an Expires header and leave the Cache-Control header blank (unset). Then if you need to 'override' a previously-cached object on an emergency basis, it's usually far more efficient to simply re-name it on the server, thus avoiding any need for page-load-by-page-load revalidation of included objects that almost never change.

There's no problem with using your 'modular' approach. You'll see the most gain right from the start with the filetypes you've listed, since images are requested most often and can be large, and the others you've listed, although perhaps not requested as often, can be even larger (e.g. the .pdf and .flv files). The benefit for each object type can be approximated as the number of requests per unit time multiplied by the filesize, and that's the order you should list them in.

Jim

gsdv

4:08 am on Jan 6, 2010 (gmt 0)

10+ Year Member



Hi Jim,
Thank you. I just want to be careful not to make things worse as I am trying to make them better.
#
<FilesMatch "\.(gif¦jpe?g¦png¦css¦js¦ico¦pdf¦swf¦flv)$">
ExpiresDefault A604800
Header set cache-control: "no-cache, public, must-revalidate"
</FilesMatch>
#
Along that line... With the above code in the htaccess, if I update an image, for example, and I use the same image name, will the browser still recognize that I changed the image and download the new one? Or will the user still see the old image until the time (A604800) expires?

Conversely, if I don't update any of those types of files, will the cache retain the files until the time (A604800) expires and then re-access the files?

Basically what I am trying to accomplish is to speed up the browsing experience by caching certain files that change somewhat infrequently, but not have to change the file name if I make any updates...

jdMorgan

7:06 am on Jan 6, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I strongly suggest that you drop "public" from your Cache-Control header, as it makes objects publicly cacheable even if they are otherwise marked as non-cacheable. See RFC2616 Section 14.9, etc.

It is unusual for images to change once published. Most are "pictures" or "paintings" or "photos" or "logos." But if you have images that change because they contain, for example, data rendered into visual form, and you do not wish to change the image name when the data has changed and a new image is rendered, then I suggest you segregate your images into two groups: those that don't change or whose filenames can be updated if the images change, and those which are dynamically-generated and whose filenames cannot change. Then put these two groups into different directories or directory-structures. You can then put two different cache-control policies into effect by putting two different .htaccess files at the top of these two directory branches.

For regular images, just send an Expires header, setting it to a week or two. For the images that change frequently, set the expires-after time to one-half of your average update interval, and set no-cache and must-revalidate in the Cache-Control header.

That's all that is needed: a long-term Expires header with no Cache-Control header at all for unchanging images, and a shorter-term Expires header plus "no-cache, must-revalidate" for the images that you do expect to change.

If you dig into the specs (RFC2616), you'll find that "public" and "no-cache" don't mean what you might think they mean. The only way to make sense of all this is to read and understand the protocol specifications.

Jim

gsdv

7:20 am on Jan 7, 2010 (gmt 0)

10+ Year Member



hi jim,

thank you. that seems like a very good solution. so i would ideally have two groups:

Infrequent htaccess file
#
<FilesMatch "\.(gif¦jpe?g¦png¦css¦js¦ico¦pdf¦swf¦flv)$">
ExpiresDefault A604800
</FilesMatch>
#

Frequent htaccess file
#
<FilesMatch "\.(gif¦jpe?g¦png¦css¦js¦ico¦pdf¦swf¦flv)$">
ExpiresDefault A259200
Header set cache-control: "no-cache, must-revalidate"
</FilesMatch>
#
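
For reference, the A-values above work out to 7 days (604800 seconds) and 3 days (259200 seconds). Below is a sketch of the same two files using the alternative interval syntax of mod_expires, which some find easier to read. The directory descriptions in the comments are hypothetical, and `ExpiresActive On` is repeated on the assumption that it is not already set in a parent configuration.

```apache
# Hypothetical file: .htaccess in the directory for rarely-changing files
ExpiresActive On
<FilesMatch "\.(gif|jpe?g|png|css|js|ico|pdf|swf|flv)$">
# Same lifetime as A604800
ExpiresDefault "access plus 1 week"
</FilesMatch>

# Hypothetical file: .htaccess in the directory for frequently-changing files
ExpiresActive On
<FilesMatch "\.(gif|jpe?g|png|css|js|ico|pdf|swf|flv)$">
# Same lifetime as A259200
ExpiresDefault "access plus 3 days"
Header set Cache-Control "no-cache, must-revalidate"
</FilesMatch>
```

Note that mod_expires normally emits a matching Cache-Control: max-age value alongside the Expires header, so responses from the first directory will usually still carry a Cache-Control header even though none is set explicitly.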