Forum Moderators: Ocean10000 & incrediBILL & phranque

High Performance Cache with mod-rewrite?

How to deliver dynamic content statically, plain or compressed if possible

12:33 am on Feb 19, 2004 (gmt 0)

New User

10+ Year Member

joined:Feb 18, 2004
votes: 0

Hello everyone,

I'm looking for a high-performance solution to deliver cached pages (compressed if possible, uncompressed otherwise), or dynamically generated content whenever the content has changed.

Right now, all content of my high-traffic bulletin board is served through a forum.php script, which looks at the $_SERVER['HTTP_ACCEPT_ENCODING'] variable to determine whether to send a plain .html file or a gzipped one if the browser can handle it. Then the script checks a cache directory to find out whether a cached version is already present. If it is, fine, deliver it. If not, the appropriate script is called to deliver the front page, a forum's contents, or a thread. Besides delivering the page, each script also stores a plain and a gzipped version in the cache directory.

Whenever a page needs an update, the script simply deletes the corresponding file in the cache directory and it will be recalculated the next time it is requested.

So far so good, or not?

I think calling a PHP script to request a cached file is not the perfect solution, even more so as I have to run PHP as a CGI, not as an Apache module. My investigation of mod_rewrite led me to something like this in my /forum directory:

    RewriteEngine On
    RewriteBase /forum
    RewriteRule ^(.+)/$ mycache/forum/$1.html [T=text/html]
    RewriteRule ^$ mycache/forum.html [T=text/html]


  • /forum/ will be fetched from /mycache/forum.html,
  • /forum/topic1/ from /mycache/forum/topic1.html,
  • /forum/topic1/123/ from /mycache/forum/topic1/123.html, etc.
    (yes, all my files look like directories from the outside).

If the requested file in the mycache directory doesn't exist, a 404 error needs to be avoided; I could place another .htaccess file in the mycache directory to handle this and call my forum.php script.
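The trick described above could be sketched with an ErrorDocument directive in a mycache/.htaccess (a sketch, untested; forum.php is the script from this thread, everything else is an assumption):

```apache
# mycache/.htaccess -- when a cached file is missing, hand the
# request to the forum script instead of showing Apache's 404 page.
ErrorDocument 404 /forum/forum.php
```

Note that the response still carries a 404 status unless the script overrides it; a CGI script can emit a "Status: 200 OK" header to do so.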

Now my questions:

  1. Is mod_rewrite able to act according to the browser's ability to handle compressed pages? (If not, I can think of a cookie delivering this information from the second page the user asks for onward.)

  2. How should the .htaccess file in the mycache directory redirect to my forum.php script without the disturbing 404 header?

  3. If question 2 can be handled, what about the performance of this solution compared with something like:

    RewriteCond %{REQUEST_URI} ^/(.+)/$
    RewriteCond path/to/root/forum/mycache/%1.html -f
    RewriteRule ^.*$ mycache/%1.html

The first solution is IMHO the fastest way to an existing cached version, but it poses the need for an ugly 404 handler in case the file doesn't exist. The second way avoids the 404 issue, but I guess the price tag is a loss of performance, as each file needs to be looked up in the mycache directory first.

Both CPU load/RAM limitations and bandwidth concerns apply in my case, so I'm looking for a solution that handles the plain and compressed cache without unnecessarily calling scripts. Unfortunately, I don't have the option to change the configuration of my Apache server; at least it understands mod_rewrite.

What should I do? Any idea is highly appreciated.

3:07 pm on Feb 19, 2004 (gmt 0)

Senior Member

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Mar 31, 2002
votes: 0


Welcome to WebmasterWorld [webmasterworld.com]!

I've never tried what you're doing, but I'll offer a couple of comments.

You may be able to check the browser capabilities with something like:

 RewriteCond %{HTTP_ACCEPT_ENCODING} gzip 

You can avoid the need for the URL-to-filename conversion rule in your question 3 example by using

 RewriteCond %{REQUEST_FILENAME} -f 
instead of the
 RewriteCond %{REQUEST_URI} 
construct shown.

The "-f" check for file existence should not be too much of a problem. After all, the server itself has to check this in order to decide whether to return a 404 or serve the file. It's likely you'll only have performance problems if you use "-F" or "-U", which invoke an internal subrequest and are therefore relatively slow. Set up your rules such that they will match only cacheable subdirectories and filetypes for maximum efficiency.

In addition, you could invoke a script to create a new cached version if the -f check fails. Actually, you'd probably want to go ahead and serve the plain version, but "trigger" a script to create a new cached version of the file that was requested. All kinds of options here, all based on performance-tuning...
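Putting these points together with the layout from the question, the rules might look something like this (a sketch, untested; /path/to/root stands in for the real filesystem path):

```apache
RewriteEngine On
RewriteBase /forum

# Cheap stat()-based existence test (-f) on the mapped cache file;
# unlike -F or -U, it does not launch an internal subrequest.
RewriteCond %{REQUEST_URI} ^(.+)/$
RewriteCond /path/to/root/forum/mycache%1.html -f
RewriteRule ^.*$ mycache%1.html [L]

# No cached copy yet: let the PHP script build (and re-cache) the page.
# The pattern requires a trailing slash, so forum.php itself never loops.
RewriteRule ^.+/$ forum.php [L]
```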

I'd be very interested in a summary of what you learn while implementing this project -- and others may be, too. We've had some general discussion of using gzip recently, so maybe the ideas of caching and compressing are finally gaining momentum.


5:55 pm on Feb 19, 2004 (gmt 0)

New User

10+ Year Member

joined:Feb 18, 2004
votes: 0

Thanks Jim,

you've already helped me very much. I read about a slowdown when using

    RewriteCond %{REQUEST_FILENAME} -f

but I think I was misguided and confused it with "-U", which you mentioned as slow because it invokes an internal subrequest.

The reason why I use:

    RewriteCond %{REQUEST_URI} ^(.+)/$
    RewriteCond path/to/root/forum/mycache%1.html -f

instead of:

    RewriteCond %{REQUEST_FILENAME} -f

is the following: My

    %{REQUEST_URI}

always ends with a slash; all my URLs look like directories for historical reasons. E.g. I have three URLs to cache:

    /forum/
    /forum/topic1/
    /forum/topic1/123/

I planned to store them as

    /mycache/forum.html
    /mycache/forum/topic1.html
    /mycache/forum/topic1/123.html

instead of

    /mycache/forum/index.html
    /mycache/forum/topic1/index.html
    /mycache/forum/topic1/123/index.html

to avoid unnecessary creation of subdirectories. Therefore, the trailing slash needs to be cut off before adding .html. Is there another, sleeker way to do that?
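One sleeker variant (an untested sketch): RewriteCond test strings can also contain $N backreferences from the RewriteRule pattern, so the rule can strip the trailing slash itself and the condition can test the resulting filename directly; /path/to/root is a placeholder.

```apache
RewriteEngine On
RewriteBase /forum

# $1 is the requested path minus its trailing slash, captured by the
# rule pattern and reused in the condition's test string.
RewriteCond /path/to/root/forum/mycache/forum/$1.html -f
RewriteRule ^(.+)/$ mycache/forum/$1.html [L]
```

The /forum/ front page itself (an empty path in per-directory context) would still need its own rule, as in the original question.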

I tried

RewriteCond %{HTTP_ACCEPT_ENCODING} gzip
, but it didn't work. On page 12/13 of the Apache documentation (mod_rewrite.pdf) there is a list of the available server variables:

where NAME_OF_VARIABLE can be a string taken from the following list:

    HTTP headers:
        HTTP_USER_AGENT, HTTP_REFERER, HTTP_COOKIE, HTTP_FORWARDED,
        HTTP_HOST, HTTP_PROXY_CONNECTION, HTTP_ACCEPT

    connection & request:
        REMOTE_ADDR, REMOTE_HOST, REMOTE_USER, REMOTE_IDENT,
        REQUEST_METHOD, SCRIPT_FILENAME, PATH_INFO, QUERY_STRING,
        AUTH_TYPE

    server internals:
        DOCUMENT_ROOT, SERVER_ADMIN, SERVER_NAME, SERVER_ADDR,
        SERVER_PORT, SERVER_PROTOCOL, SERVER_SOFTWARE

    system stuff:
        TIME_YEAR, TIME_MON, TIME_DAY, TIME_HOUR, TIME_MIN, TIME_SEC,
        TIME_WDAY, TIME

There is no HTTP_ACCEPT_ENCODING in that list. I think

    %{HTTP_COOKIE}

will do the job, not on the first page a user requests (having no cookie), but on every subsequent one after I've set a cookie according to the browser's ability.
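For the record, that cookie fallback could be sketched like this (untested; the cookie name gz=1 and the .html.gz cache naming are assumptions, and the PHP side would set the cookie on the first, uncompressed response):

```apache
RewriteEngine On
RewriteBase /forum

# Serve the precompressed copy only to clients whose earlier visit
# set a gzip-capability cookie; the Content-Encoding header for the
# file still has to be arranged separately (e.g. via AddEncoding).
RewriteCond %{HTTP_COOKIE} gz=1
RewriteCond %{REQUEST_URI} ^(.*)/$
RewriteCond /path/to/root/forum/mycache%1.html.gz -f
RewriteRule ^.*$ mycache%1.html.gz [L]
```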

After searching a little bit more, I stumbled over the Apache docs on mod_rewrite with the following subheader: On-the-fly Content-Regeneration ([engelschall.com]). They call it esoteric :-) but it is pretty much the same idea.

so maybe the ideas of caching and compressing are finally gaining momentum.

You said it. My website is mainly a BBS with 20 posts per page and 5 million page impressions per month on a bandwidth of 50 GB. This means about 10 KB per page including graphics and everything.

Sounds impossible? No, it's not.

I use graphics sparingly, external cascading stylesheets extensively (cacheable), no table constructs or other waste of code, all navigational elements stuffed into an external JavaScript file (cacheable), gzipped whenever possible, and everything works like a charm... except... I have a little too much RAM consumption on a 256 MB RAM Linux server, causing "CGI limits reached" repeatedly during the rush hour.

I'll keep you updated on my experience.


HTTP_ACCEPT allows testing for graphic formats and e.g. whether a client accepts Shockwave, but not for gzip support.
11:32 pm on Feb 19, 2004 (gmt 0)

New User

10+ Year Member

joined:Feb 18, 2004
votes: 0

Testing for the existence of a cached version and calling a script if it's not found works smoothly right now using:

    RewriteEngine On
    RewriteBase /forum
    RewriteCond %{REQUEST_URI} ^(.*)/$
    RewriteCond path/to/root/forum/mycache%1.html -f
    RewriteRule ^.*$ mycache%1.html

    RewriteRule ^([^/]+)/$ forumlist.php?myforum=$1
    RewriteRule ^([^/]+)/(\d+)/$ thread.php?myforum=$1&mythread=$2

    DirectoryIndex index.php

The next problem now is: how do I do the same with a zipped version? To create a gzipped file with PHP I do:

    $text = "Hello World";

    // 8 gzip header bytes; the two zlib header bytes that follow
    // land in the gzip header's XFL/OS slots (which decoders ignore)
    $zipped = "\x1f\x8b\x08\x00\x00\x00\x00\x00";
    // zlib stream minus its 4-byte Adler-32 footer
    $zipped .= substr(gzcompress($text, 2), 0, -4);
    // gzip footer (CRC32 + original length) -- without it some clients
    // reject the stream; gzencode($text, 2) builds all of this in one call
    $zipped .= pack('V', crc32($text)) . pack('V', strlen($text));

    $fp = fopen("$_SERVER[DOCUMENT_ROOT]/forum/mycache/forum/test.zip", 'w');
    flock($fp, LOCK_EX);
    fwrite($fp, $zipped);
    flock($fp, LOCK_UN);
    fclose($fp);

If I want to send it out with PHP I do:

    header('Content-Encoding: gzip');
    // readfile() echoes the file directly; its return value is the
    // byte count, so don't wrap it in print
    readfile("$_SERVER[DOCUMENT_ROOT]/forum/mycache/forum/test.zip");

So how should I do it now? Write the header into the .zip file, or send the header with mod_rewrite? Right now I see binary output in my browser if I try to serve the zipped file via mod_rewrite.

Thanks in advance

12:19 am on Feb 23, 2004 (gmt 0)

New User

10+ Year Member

joined:Feb 18, 2004
votes: 0

As promised, I just wanted to let everyone know about my "esoteric" cache 'n' zip rewrite rules.

First of all, I finally found a direct way to check the browser's ability to handle zipped content. It's the

    %{HTTP:Accept-Encoding}

variable I had overlooked before, similar to what Jim had suggested. Secondly, we need to tell the browser about the zipped nature of the content (encoding) in case we deliver a zipped file. And finally, we should tell the browser what kind of content (text/html) we deliver.

All together we get a code like:

    AddEncoding x-gzip .zip
    AddType text/html .zip
    AddType text/html .txt

    RewriteEngine On
    RewriteBase /forum

    RewriteCond %{HTTP:Accept-Encoding} gzip
    RewriteCond %{REQUEST_URI} ^(.*)/$
    RewriteCond /path/to/root/mycache%1.zip -f
    RewriteRule ^.*$ /mycache%1.zip [L]

    RewriteCond %{HTTP:Accept-Encoding} !gzip
    RewriteCond %{REQUEST_URI} ^(.*)/$
    RewriteCond /path/to/root/mycache%1.txt -f
    RewriteRule ^.*$ /mycache%1.txt [L]

    RewriteRule ^([^/]+)/$ forumlist.php?myforum=$1 [L]
    RewriteRule ^([^/]+)/(\d+)/$ thread.php?myforum=$1&mythread=$2 [L]

Alternatively, one could replace the last two (or more) lines with a redirect to a wrapper script, which decides what to do and takes care of creating new cached versions of the files.

    RewriteRule ^(.*)$ /forum.php?$1 [L]

I'm using these settings now and experience a slight but noticeable reduction in reaction time. More important is the reduction in server load, which has remarkably reduced the traffic jam ("CGI limits reached") during my website's rush hours.

Good luck.

PS: I was not able to use %{DOCUMENT_ROOT} in my rewrite rules; instead I had to hardcode the path, as it was quite different. I have no idea why %{DOCUMENT_ROOT} differs from PHP's $_SERVER['DOCUMENT_ROOT']. And one other variable I tried did not show any value.
10:20 am on Mar 10, 2004 (gmt 0)

Junior Member

10+ Year Member

joined:Oct 26, 2002
votes: 0

Is there a special reason why you are not using the caching classes for PHP that are available? There is, for example, jpcache, and there are also two PEAR classes that offer caching functions.
I am very happily using jpcache. Adding it is very easy; with auto_prepend you won't even have to edit your scripts...
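For reference, the auto_prepend hook might be wired up like this (a sketch; the jpcache path is a placeholder, and php_value only works when PHP runs as an Apache module):

```apache
# .htaccess -- run jpcache before every PHP script in this tree,
# without editing the scripts themselves.
php_value auto_prepend_file /path/to/jpcache/jpcache.php
```

With PHP as a CGI, the equivalent would be auto_prepend_file = "/path/to/jpcache/jpcache.php" in php.ini.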
3:39 pm on Mar 12, 2004 (gmt 0)

New User

10+ Year Member

joined:Feb 18, 2004
votes: 0

Is there a special reason why you are not using the caching classes for PHP that are available? There is, for example, jpcache, and there are also two PEAR classes that offer caching functions.
I am very happily using jpcache. Adding it is very easy; with auto_prepend you won't even have to edit your scripts...

There are several ways to do that with Perl and PHP. Unfortunately, on my old server both Perl and PHP were invoked as CGI and not as an Apache module. Thus, on my heavy-traffic forum I had a lot of problems with CGI limits. It was my assumption that doing the job without starting PHP for every single page request would improve performance.

Now I've got a better server with four times the RAM and Perl/PHP as Apache modules. Looking back, I cannot definitely say whether my caching mechanism improved performance on the old server; the time to judge was just too short. But I am still curious whether serving the cached pages would be a performance win in this situation.

The downside of the described solution: I cannot control the headers on my managed server when files are served directly (this would be different if I could change my Apache settings, but unfortunately I cannot). So some pages are cached on the user side or by proxies in cases where I don't want them to be. Meta tags didn't really help me here, as they are frequently ignored by proxies, for example.

PHP would allow me to influence the headers, so I'm thinking of using the PHP built-in mechanisms you mentioned.