<-- Using "caching", we can reduce the load on the server and allow faster page loading times on our dynamic pages.
Caching works like this: a Web cache sits between Web servers and one or more clients, watching requests for HTML pages, images, and files go by and saving a copy of each for itself. Then, if there is another request for the same object, it serves the copy it already has instead of asking the original server for it again.
The downsides are:
- There is a possibility that you could retrieve an old copy of a document which has since been updated. When the cache stores a document, it estimates how long the original is likely to remain unchanged. During this period the cache will serve its copy of the document without checking the originating site, so if the original does change, the cache may return a copy which is out of date. With most browsers the user can issue the 'reload' command to force the cache to retrieve the original again, but doing so eliminates much of the value of caching.
- Reduced performance for some requests. Objects that are not found in either the local or national cache must be fetched from the original server by the national cache and then returned to the browser. There is a slight overhead in processing requests via the cache, so performance will be somewhat slower than if the page had been returned directly. However, the performance of the servers and network is such that this difference is barely perceptible under normal circumstances. -->
Can anyone help make this clearer and maybe provide a more "user-friendly" explanation of the pros and cons? Search engine ranking and spidering are not things most professional scripters are familiar with, so they weren't taken into consideration when this suggestion was made to me.
Thanks
Mike
A "cache" is a store of saved items. Arctic explorers would leave well-marked caches of food and supplies on their way into remote regions, so they could access those supplies for use on their return trip, yet not have to carry those supplies all the way "there and back again."
Web caching is based on the same idea. Make Web resources easily and locally accessible, without having to re-request them "all the way" from the original server. Caching speeds up the user experience and can greatly reduce the load on a dynamically-generated site.
Browsers keep a local copy of requested pages, and serve those copies if the page is re-requested before the cache entry is marked as 'expired' -- that is to say, outdated. You can control this expiry time. (For Internet Explorer, you can view your cache by examining the files in your "Temporary Internet Files" directory.)
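To make that concrete, here is a minimal sketch (Python standard library only; the handler name, port, and page body are my own placeholders) of a server response that tells browser caches they may keep a page for one hour:

    from http.server import BaseHTTPRequestHandler, HTTPServer
    from email.utils import formatdate
    import time

    class CacheDemoHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = b"<html><body>Hello from the cache demo.</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            # Any cache (browser or proxy) may store this copy for 3600 seconds.
            self.send_header("Cache-Control", "public, max-age=3600")
            # Older HTTP/1.0 caches look at Expires instead: one hour from now.
            self.send_header("Expires", formatdate(time.time() + 3600, usegmt=True))
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), CacheDemoHandler).serve_forever()

Re-request the page within the hour and the browser should serve its cached copy without touching the server at all.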
Some ISPs and corporations also use caching proxies -- AOL, for example. Copies of Web resources requested from inside their network are kept in local caches, so they need not be re-requested from the Web if another local network user requests the same item. Again, this reduces bandwidth into and out of their network, and can speed up the user experience. Caching reduces not only the users' bandwidth but also Internet bandwidth as a whole, improving the experience for *everyone*.
Pages that change in important ways every time they are requested should not be cached, while static pages, logo images, included scripts and CSS files, and other resources can be marked as cacheable for hours, weeks, or even months, as appropriate.
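As a rough illustration of that kind of policy (the extensions and lifetimes below are arbitrary examples I made up, not recommendations), you might choose a cache lifetime per resource type:

    # Map file extensions to a max-age in seconds: long for static assets,
    # short for ordinary pages. These numbers are illustrative only.
    CACHE_LIFETIMES = {
        ".css": 7 * 24 * 3600,    # stylesheets: one week
        ".js": 7 * 24 * 3600,     # included scripts: one week
        ".png": 30 * 24 * 3600,   # logos and other images: one month
        ".html": 3600,            # ordinary pages: one hour
    }

    def cache_control_for(path):
        """Return a Cache-Control header value for a request path."""
        for ext, seconds in CACHE_LIFETIMES.items():
            if path.endswith(ext):
                return "public, max-age=%d" % seconds
        # Anything unrecognized: tell caches not to store it at all.
        return "no-store"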
See this Web caching tutorial [mnot.net].
Jim
If every visitor is pulling up a different dynamic page (there are over 1 million pages on my site) each time they visit, then caching is of no real use or resource saving to me?
Have I understood that correctly?
Thanks
Dynamic pages where the content changes based on time, session cookie values, browser type, visitor domain or IP address, referrer, or anything else are usually non-cacheable, and must be marked as such by returning proper cache-control headers with the server response. Use a server headers checker to verify.
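If you don't have a headers checker handy, a few lines of Python (standard library only; the URL is a placeholder) will show what caching headers a server actually returns:

    import urllib.request

    def show_cache_headers(url):
        """Fetch a URL and print the response headers that govern caching."""
        with urllib.request.urlopen(url) as response:
            for name in ("Cache-Control", "Expires", "Last-Modified",
                         "ETag", "Pragma"):
                print("%s: %s" % (name, response.headers.get(name, "(not sent)")))

    show_cache_headers("http://example.com/")

A dynamic page that must never be cached should come back with something like "Cache-Control: no-store" (or "no-cache"); if those headers are missing, an intermediate cache is free to guess.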
Cacheability factors should be considered as part of the overall site design. Changing just a few popular pages so that they are cacheable can dramatically reduce server load. It helps to consider, "How often does this page change now?" and "How often does this page really need to change?"
There are three levels to this as well. The first --and pre-existing-- level is the user's browser cache. It keeps a copy of pages, unless they are marked as non-cacheable, until the stated expiry time passes, until the copy is replaced by something newer (if the cache is full), or until a default expiry time is reached.
The second level is network caches -- caching proxies that sit between your server and your visitor. Corporations and ISPs commonly use them.
The third level is your server. Many sites are set up so that the page generation script creates a static file as its output and serves that. It only runs and creates a new static page if one does not exist. Then, a cron job is used once a day or once an hour to delete the static copy, forcing the script to re-generate the page on the next request. This can dramatically reduce CPU load, and if the static pages are saved in a compressed format, it can further reduce bandwidth utilization.
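A bare-bones sketch of that pattern, assuming a hypothetical generate_page() standing in for the expensive dynamic work (a cron job that deletes cache/page.html would force regeneration on the next request):

    import os

    CACHE_PATH = "cache/page.html"

    def generate_page():
        # Stand-in for the expensive dynamic page-generation work.
        return "<html><body>Generated page</body></html>"

    def get_page():
        if not os.path.exists(CACHE_PATH):
            os.makedirs(os.path.dirname(CACHE_PATH), exist_ok=True)
            with open(CACHE_PATH, "w") as f:
                f.write(generate_page())  # run the script once, save the output
        with open(CACHE_PATH) as f:
            return f.read()               # later requests are a cheap file read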
It's a fairly complicated and deep subject, and I suggest that those interested spend some time researching it.
Jim