Configure server to serve pre-generated gzip pages

         

ocon

7:37 pm on May 1, 2011 (gmt 0)

10+ Year Member Top Contributors Of The Month



Originally my homepage was an index.php file because my content changed daily. However, because I get a lot of traffic to this page I wanted to speed it up.

I created a script to generate the homepage as an index.html file and set a cronjob to run this script once a day. Now the page is faster, but I want to do better.

I extended this script to generate an index.html.gz file alongside index.html.

I set my .htaccess file to serve the gzipped version when it can, but I think I'm getting this code wrong because it's prompting users to download the homepage instead of viewing it in their browser.

.htaccess:

RewriteEngine On
RewriteBase /

RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^/?$ /index.html.gz [L]


Any help is greatly appreciated.

Of note, this is the only page on my site.

robzilla

7:49 pm on May 1, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It seems to me you're missing the AddEncoding directive, which "maps the given filename extensions to the specified encoding type" [httpd.apache.org].

RewriteEngine On
RewriteBase /
AddEncoding gzip .gz
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^/?$ /index.html.gz [L]

Added: you may have to forcibly set the content-type to text/html, as such:

RewriteEngine On
RewriteBase /
AddEncoding gzip .gz
ForceType text/html
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^/?$ /index.html.gz [L]

Despite the fact that index.html is all you've got (no images at all? no external javascript or stylesheet?), you may want to limit the rules to either just index.html or all .html files to prevent unnecessary rewrites for files such as robots.txt (regardless of whether or not you have one).

RewriteEngine On
RewriteBase /
<FilesMatch "\.(html.gz)$">
AddEncoding x-gzip .gz
ForceType text/html
</FilesMatch>
<FilesMatch "\.(html)$">
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^/?$ /index.html.gz [L]
</FilesMatch>

I haven't actually tested this, so let me know how it works out.

ocon

9:44 pm on May 1, 2011 (gmt 0)

10+ Year Member Top Contributors Of The Month



Thank you very much! Unfortunately the third option didn't work but the second one did.

I do have some concerns though. It seems that everything is now being output as text/html, including my robots.txt file. My guess is that the AddEncoding and ForceType lines are applied to every request. I'm sure there will be other problems because of these two rules, but I haven't come across them yet.

Also, the RewriteRule seems to apply only if I go to domain.tld or domain.tld/, but not domain.tld/index.html. In that third case, the page is not being sent compressed.

robzilla

2:27 pm on May 2, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I had some time to test this for you; the above examples are messy, and should perhaps not have been posted at all. Having said that, the following ought to work for you:

RewriteEngine On
RewriteBase /
AddEncoding x-gzip .gz
<Files index.html.gz>
ForceType text/html
</Files>
RewriteCond %{HTTP:Accept-Encoding} gzip
RewriteRule ^index\.html$ index.html.gz [L]

As you can see, I've limited the ForceType directive to <index.html.gz> so that other files are not affected, and I also simplified the RewriteRule since all you're looking to rewrite is <index.html>.
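A slightly more defensive variant of the rules above, offered as an untested sketch rather than a drop-in replacement: it only rewrites when index.html.gz actually exists on disk, and it adds a Vary header so that caches keep the compressed and uncompressed copies apart. The Vary part assumes mod_headers is enabled, which the thread does not confirm.

```apache
RewriteEngine On
RewriteBase /

# Mark .gz files as gzip-encoded (x-gzip, used above, is the older equivalent token).
AddEncoding gzip .gz

<Files index.html.gz>
    # The compressed file is still an HTML page, so say so explicitly.
    ForceType text/html
</Files>

# Assumption: mod_headers is loaded. Skipped silently if it is not.
<IfModule mod_headers.c>
    <FilesMatch "^index\.html(\.gz)?$">
        # Both variants share one URL, so caches must key on Accept-Encoding.
        Header append Vary Accept-Encoding
    </FilesMatch>
</IfModule>

# Only rewrite when the client accepts gzip...
RewriteCond %{HTTP:Accept-Encoding} gzip
# ...and the pre-compressed file is really there (e.g. the cron job has run).
RewriteCond %{REQUEST_FILENAME}.gz -f
RewriteRule ^index\.html$ index.html.gz [L]
```

The extra RewriteCond means that if the cron job ever fails to produce index.html.gz, visitors fall back to the plain index.html instead of getting a 404.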

robzilla

2:48 pm on May 2, 2011 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Some other tips on improving performance for this one-page site:
  • Consider moving the above rules to httpd.conf if you have access to it, and then set the AllowOverride directive to None to stop Apache from looking for your .htaccess file [webmasterworld.com].
  • Make sure you create index.html.gz using the maximum possible compression (e.g. gzip -9).
  • Strip your HTML code of unnecessary whitespace, comments, etc. (but keep a structured copy as a backup to allow for easy editing). Same goes for any inline CSS or Javascript you may have.
  • If you get a lot of repeat visits and/or reloads, line up the expires headers for index.html with the next moment at which you re-generate the file.
  • If index.html really is the only file requested from your server and you serve a lot of those requests, consider setting the KeepAlive directive to Off in httpd.conf for this particular site.
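As a rough illustration of the expires tip, here is a minimal sketch using mod_expires. It assumes mod_expires is loaded and that the file is regenerated every 24 hours, as described earlier in the thread. Basing the expiry on the file's modification time, rather than on the time of the request, makes the cached copy expire around the moment the next version is generated.

```apache
<IfModule mod_expires.c>
    ExpiresActive On
    # Cover both the plain and the pre-compressed homepage.
    <FilesMatch "^index\.html(\.gz)?$">
        # Expiry is counted from the file's last modification time,
        # i.e. from the moment the daily script last wrote the file.
        ExpiresDefault "modification plus 24 hours"
    </FilesMatch>
</IfModule>
```

If the daily job finishes a little late, browsers will simply re-request the page until the new file is in place, since the old expiry time will already have passed.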

ocon

6:48 pm on May 2, 2011 (gmt 0)

10+ Year Member Top Contributors Of The Month



That's great, thank you. I'll test it out when I get home.

Unfortunately I don't have access to httpd.conf so I have to use the .htaccess file.

I'm already using gzip -9, but it's great to get feedback on this. I know it takes a little longer to compress than the default level; I was worried that it might also take longer to decompress in the browser.

Right now I am trying to figure out how to set up an expiration header to make the browser cache the page for a reasonable amount of time. Normally I would send it via a header inside the PHP document, but since I'm delivering just an HTML file, I'm trying to figure out whether I can put something in the .htaccess file instead.

This page does have some graphics on it, but I have them hosted on a CDN. I used PNGSlim to make them as optimized as I can.

It's not a one-page site, but the homepage will definitely have a lot more traffic than the other, minor pages. The other pages have different configurations, so I won't be able to carry over some of the same optimization techniques. However, I want to make sure that the optimization techniques on the homepage don't negatively impact the other pages, as happened with the ForceType problem.

I do want to pass one variable to the page to pre-populate a field. I'm considering sending it in via an anchor (example.com/#value) and using JavaScript to insert the information into the field. Would this cause the browser to have to redownload the page, like it would if it were sent as example.com/?name=value?

Thank you again very much for all your help!