.htaccess optimization, and gzip

Help using the fewest possible resources

         

cYbErDaRk

10:42 pm on Apr 30, 2007 (gmt 0)

10+ Year Member



I'm trying to optimize this code for my CMS. It produces .html and .jgz (gzip-compressed HTML) files on demand via PHP if they're not already present on the hard drive, and sends them to the client (with the appropriate encoding). The CMS backend deletes the affected files as administrators insert, update, or delete the website's contents.

It seems fast (I have other versions of this config running on other sites with no problems), but I'd like to ask whether you think anything should be changed to speed up the website. Also, I haven't been able to add some of the 'stop looping' snippets I've seen on other sites.

Here are the header and some relevant parts of the .htaccess. The rest works the same way.

Thanks in advance.

cYbErDaRk

RewriteRule \.(png¦gif¦jpe?g¦css¦js¦php)$ - [NC,L]

RewriteCond %{REQUEST_URI} ^/$
RewriteRule ^.*$ /tienda/

RewriteCond %{REQUEST_URI} ^/tienda/
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^(.*)$ jgz/$1
RewriteCond %{REQUEST_URI} ^/tienda/
RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteRule ^(.*)$ html/$1

RewriteRule ^(html¦jgz)/tienda/$ /index.php [E=T:$1,C]

RewriteCond %{DOCUMENT_ROOT}/xcache/index.%{ENV:T} -f

RewriteRule ^.*$ xcache/index.%{ENV:T} [L]

RewriteRule ^(html¦jgz)/tienda/se/([^/]*)/([^/]*)/?$ /$1/tienda/se/$2/$3/0/ [L]

RewriteRule ^(html¦jgz)/tienda/se/([^/]*)/([^/]*)/([^/]*)/?$ /seccion.php?cod=$2&desp=$4 [E=T:$1,E=A:$2,E=B:$4,C]

RewriteCond %{DOCUMENT_ROOT}/xcache/seccion-%{ENV:A}-%{ENV:B}.%{ENV:T} -f

RewriteRule ^.*$ xcache/seccion-%{ENV:A}-%{ENV:B}.%{ENV:T} [L]

g1smd

12:10 am on May 2, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I really wish that each line of code was preceded by a comment or description of what the next line is supposed to actually do.

This would help people who already know a lot to debug that code, and verify the expected functionality, and it would help beginners to this subject understand a bit more as to how this stuff works.

cYbErDaRk

1:40 am on May 3, 2007 (gmt 0)

10+ Year Member



Hi g1smd

Since I wrote this code, I've spent a few hours (6?) trying to improve it.

Here are my latest results, with an explanation of what the code does (or should do :)).

This final version seems to work OK, but I'd like to know if anyone finds something "buggy" or any possibility of looping. Any ideas on how to avoid using the rewrite map?

Thanks for your time.

David


RewriteEngine On

# ignore static content and other non-cacheable content
RewriteRule \.(png¦gif¦jpe?g¦css¦js¦php)$ - [NC,L]
RewriteRule ^buscar/?$ /buscar\.php$1 [QSA,L]

...

# Trick one. Detect user's ability to support gzip. Rewrite it for later processing
# URLS initially starting with /tienda/ will be the only ones to use this mechanism

RewriteCond %{REQUEST_URI} ^/tienda/
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^(.*)$ jgz/$1 [L]
RewriteCond %{REQUEST_URI} ^/tienda/
RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteRule ^(.*)$ html/$1 [L]

# ... check whether a file exists whose name is "static__" plus the original URL (see the RewriteMap explanation below) plus .html or .jgz, according to the user's capabilities
# The "f" prefix is added to avoid recursion
RewriteRule ^(html¦jgz)/(.*)$ f/$1/$2 [E=T:$1,E=A:$2,C]
RewriteCond %{DOCUMENT_ROOT}/xcache/static__${p:%{ENV:A}}.%{ENV:T} -f
RewriteRule ^.* /xcache/static__${p:%{ENV:A}}.%{ENV:T} [L,NS]

# Final URL example
# If no static content has been written yet, pass control to PHP (which will write both the .html and .jgz files, with "static" at the beginning of the filename)

# ... I usually mark different actions ("view product", "view sections", etc.) with a different URL ("/se/", "/p/", etc.)
# Pass to PHP, with or without the trailing slash. PHP takes care of it (301 if not present)
RewriteRule ^f/(html¦jgz)/tienda/se/([^/]*)/([^/]*)/([^/]*)/?$ /seccion.php?cod=$2&desp=$4 [L]
RewriteRule ^f/(html¦jgz)/tienda/s/([^/]*)/?$ /$2\.php [L]

... and more

----------------

The rewrite map:

(on sites-enabled/001-client)

RewriteMap p prg:/home/client/www/p.pl

----------

p.pl:

#!/usr/bin/perl
$¦ = 1;

while(<STDIN>) {
s/\//_/g ;
print $_;
}

Explanation of how it works:

URLs beginning with /tienda/ are first rewritten to html/tienda/... or jgz/tienda/..., depending on whether the user's browser supports gzip.
Then all "/" are replaced with "_". This lets me store all the static content in the same directory (xcache).

If present, serve it.

If not, pass to PHP, which renders the page for the user and also writes the .html and .jgz files to the xcache dir. Apache serves .jgz files as text/html with x-gzip encoding: AddType text/html .jgz, AddEncoding x-gzip .jgz
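Written out as configuration lines, the two mod_mime directives mentioned above would look roughly like this. This is only a sketch: it assumes mod_mime is loaded and that the host allows these directives in .htaccess (AllowOverride FileInfo); they could equally be placed in the server config.

```apache
# Treat cached .jgz files as HTML documents...
AddType text/html .jgz
# ...and label them as gzip-compressed, so that Apache sends a
# Content-Encoding header and the browser decompresses the body.
AddEncoding x-gzip .jgz
```

With these in place, a rewritten request that ends at /xcache/static__tienda_se_1_futbol_.jgz should be delivered as compressed HTML rather than offered as a download.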

The file PHP writes is named after the REQUEST_URI, with "/" replaced by "_". Static content is deleted by the admin backend (PHP) when records are inserted, updated, or deleted, or by hand when the whole cache has to be flushed. Static content is written only on request, rather than regenerating the whole site/section/product each time something changes in the DB.

Example final URL:

[client.com...]

Static content written:

/xcache/static__tienda_se_1_futbol_.html
/xcache/static__tienda_se_1_futbol_.jgz

[edited by: cYbErDaRk at 1:48 am (utc) on May 3, 2007]

cYbErDaRk

1:43 am on May 3, 2007 (gmt 0)

10+ Year Member



Just a note... sorry for my English and some misspelling :)

jdMorgan

2:26 pm on May 3, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Here are a few efficiency tweaks: Test them one at a time!

RewriteRule ^$ /tienda/

RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^(tienda/.*)$ /jgz/$1 [S=1]
RewriteRule ^(tienda/.*)$ /html/$1

RewriteRule ^(html¦jgz)/tienda/$ /index.php [E=T:$1,C]
RewriteCond %{DOCUMENT_ROOT}/xcache/index.%{ENV:T} -f
RewriteRule ^index\.php$ /xcache/index.%{ENV:T} [L]

RewriteRule ^(html¦jgz)/tienda/se/([^/]+)/([^/]+)/?$ /$1/tienda/se/$2/$3/0/ [L]

RewriteRule ^(html¦jgz)/tienda/se/([^/]+)/([^/]+)/([^/]+)/?$ /seccion.php?cod=$2&desp=$4 [E=T:$1,E=A:$2,E=B:$4,C]
RewriteCond %{DOCUMENT_ROOT}/xcache/seccion-%{ENV:A}-%{ENV:B}.%{ENV:T} -f
RewriteRule ^seccion\.php$ xcache/seccion-%{ENV:A}-%{ENV:B}.%{ENV:T} [L]


Changes include getting rid of unnecessary RewriteConds, and replacing ".*" patterns in chained RewriteRules with specific patterns to avoid doing unnecessary "file exists" checks, which are slow. The "([^/]*)" patterns were also changed to "([^/]+)" patterns to prevent some double-slash linking exploits. This change means that at least one non-slash character is needed between each slash to invoke the rule.

Change all broken pipe "¦" characters in the code above to solid pipe characters before use; Posting on this forum modifies the pipe characters.

Jim

cYbErDaRk

4:31 pm on May 3, 2007 (gmt 0)

10+ Year Member



Hi jdMorgan. First, thanks a lot for the comments, they're very useful.

As I see it, you've focused on the first example. Let's see if I can apply it to the second one.

Regarding [^/]+, I'd like to mention that even if // is passed to PHP, it would be checked and handled there.

Will try the code and tell you.

Thanks!

RewriteEngine On

RewriteRule \.(png¦gif¦jpe?g¦css¦js¦php)$ - [NC,L]
RewriteRule ^buscar/?$ /buscar\.php$1 [QSA,L]

RewriteCond %{REQUEST_URI} ^/tienda/
RewriteCond %{HTTP:Accept-Encoding} gzip
# ... I will have to study this one (tienda/.*) :)
RewriteRule ^(tienda/.*)$ /jgz/$1 [S=1]
RewriteRule ^(tienda/.*)$ /html/$1

# I need tienda/, according to my php model, to be in $2
RewriteRule ^(html¦jgz)/(.*)$ f/$1/$2 [E=T:$1,E=A:$2,C]
RewriteCond %{DOCUMENT_ROOT}/xcache/static__${p:%{ENV:A}}.%{ENV:T} -f
RewriteRule ^.* /xcache/static__${p:%{ENV:A}}.%{ENV:T} [L,NS]

RewriteRule ^f/(html¦jgz)/tienda/se/([^/]+)/([^/]+)/([^/]+)/?$ /seccion.php?cod=$2&desp=$4 [L]
RewriteRule ^f/(html¦jgz)/tienda/s/([^/]+)/?$ /$2\.php [L]

[edited by: cYbErDaRk at 4:41 pm (utc) on May 3, 2007]

jdMorgan

5:20 pm on May 3, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not sure what changes you've made, but I'll explain this one:

Original code:


RewriteCond %{REQUEST_URI} ^/tienda/
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^(.*)$ jgz/$1
RewriteCond %{REQUEST_URI} ^/tienda/
RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteRule ^(.*)$ html/$1

Here, the RewriteCond %{REQUEST_URI} lines are not needed, since REQUEST_URI can be checked by RewriteRule itself. So, we just make the test for "tienda/" explicit in the rule:

RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^(tienda/.*)$ jgz/$1
RewriteCond %{HTTP:Accept-Encoding} !gzip
RewriteRule ^(tienda/.*)$ html/$1

Note that in .htaccess, the leading slash is stripped from the requested URI "seen" by RewriteRule. That is why it has been removed from the pattern you used in the RewriteConds.

The function does not change, it is just more efficient.
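As a side note on the leading-slash point, the same rule has to be written differently depending on where it lives. The sketch below is illustrative only and assumes the .htaccess file sits in the document root; the rule itself is a simplified stand-in, not part of the code under discussion.

```apache
# Server or <VirtualHost> context: the pattern is matched against the
# full URL-path, including the leading slash. (Shown commented out.)
# RewriteRule ^/tienda/(.*)$ /html/tienda/$1

# .htaccess in the document root: the leading slash has already been
# stripped before the pattern is matched, so it must not appear there.
RewriteRule ^tienda/(.*)$ /html/tienda/$1
```

A pattern such as ^/tienda/ placed in .htaccess would simply never match, which is an easy mistake to make when moving rules between the two contexts.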

But now we have two mutually-exclusive rules, since Accept-Encoding will either contain "gzip" or it won't. So there is no need to check it twice. So, we make an "If-then-else" construct to avoid that:


# IF gzip supported
RewriteCond %{HTTP:Accept-Encoding} gzip
# THEN rewrite /tienda/ requests to jgz/ and skip next rule
RewriteRule ^(tienda/.*)$ jgz/$1 [S=1]
# ELSE rewrite /tienda/ requests to html/
RewriteRule ^(tienda/.*)$ html/$1

Again, no change in function, just more efficient, since HTTP:Accept-Encoding is only checked once.

Finally, making everything relative to root:


# If gzip supported
RewriteCond %{HTTP:Accept-Encoding} gzip
# rewrite /tienda/ requests to /jgz/ and skip next rule
RewriteRule ^(tienda/.*)$ /jgz/$1 [S=1]
# Else rewrite /tienda/ requests to /html/
RewriteRule ^(tienda/.*)$ /html/$1

The same was done for all rules that had RewriteCond %{REQUEST_URI} and a ".*" pattern in the RewriteRule.

Be aware that RewriteRules are processed first. If the pattern does not match, then the RewriteConds for that RewriteRule are not processed. Therefore, the pattern in the rule should always be as specific as possible. See the RuleSet Processing [httpd.apache.org] section of the mod_rewrite documentation for more details. Using this technique, especially when RewriteConds involve system calls such as "file exists" or reverse-DNS lookups such as "RewriteCond %{REMOTE_HOST}", can make a big difference in the performance of your server.
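To make the processing order concrete, here is a small sketch contrasting a catch-all pattern with a specific one. It reuses the xcache/index.html file name from this thread purely for illustration and is not meant as a drop-in replacement for the rules above.

```apache
# Slow variant (shown commented out): ".*" matches every request that
# reaches this rule, so the "file exists" check is made for all of them.
# RewriteCond %{DOCUMENT_ROOT}/xcache/index.html -f
# RewriteRule .* /xcache/index.html [L]

# Faster variant: the rule pattern is tested first, so the -f check is
# made only when the request is exactly "tienda/".
RewriteCond %{DOCUMENT_ROOT}/xcache/index.html -f
RewriteRule ^tienda/$ /xcache/index.html [L]
```

Both variants serve the same file for /tienda/, but only the second one avoids touching the filesystem for unrelated URLs.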

Jim

cYbErDaRk

5:42 pm on May 3, 2007 (gmt 0)

10+ Year Member



Thanks again, I think I'm now learning how mod_rewrite works :)

About the whole thing itself: the idea is to have just one file-check rule that covers all the possible static content of the website. Then, if the file is not present, control passes to PHP.