Forum Moderators: phranque

Message Too Old, No Replies

Database driven website - Converting to static

Building static pages from a large database for better performance.

         

scumm_bar

1:27 pm on Jun 24, 2003 (gmt 0)



What are the performance advantages/disadvantages of building static pages from database queries and serving them to visitors instead of the on-the-fly result?

I have a php file - mypage.php?id=xxxx (5000 possible id's) which receives 40000 views/day over those 5000 id's. Now, each time the visitor requests one of these pages (id's), it does a query for one thing (title), another for some other thing (related articles) etc, and displays the information to them. At the end of the day, there has been some 160000 queries processed (let's say 4 queries per view) and displayed to the end-user.

So, let's take a php script - or other means - to visit these pages, take the page source, and make a mypagexxxx.html with it (mypage50.html, mypage2467.html etc). I can use cron job to do this periodically. Instead of 160000 queries/day, we can set this script to do a run of 20000 queries every second sunday, creating static html files for each id, which can now be served to the visitor.

---------
The php script (before) would look like this:
---------
<?php
$result = mysql_query($sql ,$db)
include('header.html');
print("$result");
include('footer.html')
?>
---------
The static html file (after) would look like this:
---------
<HEADER> (no includes necessary)
<CONTENT> (previously the query result)
<FOOTER> (no includes necessary)

Now beforehand, using the php pages, the header and footer were static seperate files which were included into each page.

Afterwards, the header and footer go in with the content, written out seperately on each page. I needn't update the content of the html files directly (I just prepare for the next cron job) so this is fine for me in terms of managing changes.

Something I've also considered - generating .html files of only the top 500 visited pages (the bulk). I could then use mod_rewrite to check whether the html file exists (and serve it) or otherwise use the regular php-based method of fetching the result of the query for that page.

Here's the questions:
1. Is there 'caching' or any other advantage gained by inserting the header and footer from static header and footer files. (as the php pages do)
2. I don't need to manage the content of the generated html files, so am I correct in thinking that using SSI to include the header and footer in each .html page would be unnecessary. (Again, any caching advantages involved?)
3. Using the mod_rewrite method as above, does checking for whether files exist with mod_rewrite decrease performance (noticeably).
4. On a sidenote, if the header and footer files contain html, is it wise using .txt includes instead of .html (I read .txt gets handled faster by the server)

I apologise if I've not explained it well - please tell me if so. Also, I've checked for these answers all over WebmasterWorld (and the web), and couldn't find a bean. Therefore, any help would be greatly appreciated.

txbakers

1:37 pm on Jun 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would stick with the dynamic pages. Why clutter up the server with 2000 static pages?

jaski

1:39 pm on Jun 24, 2003 (gmt 0)

10+ Year Member



I think PHP/MySQL work pretty fast esp. for read only queries. You may first want to check if queries can be made faster .. have proper index etc.

jamie

1:48 pm on Jun 24, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



you could also look into caching system for mysql queries

and make sure you return correct http headers - this helps with browsers and caching.

scumm_bar

1:53 pm on Jun 24, 2003 (gmt 0)



Thanks for posting guys.

I'm looking for a way of reducing stress on the server, and maybe reducing bandwidth.

txbakers,

If I clutter up the server with 2000 static pages, does that mean - at the end of the day - I'm going to be doing the opposite to what I'm looking to achieve?

jaski,

I've been optimizing and creating indexes on columns, and it makes a difference, but the server load has still been increasing greatly over the last few months, in par with the amount of visitors.

jamie,

Thanks. I've looked briefly over this, but wasn't sure. I'll look further. Any idea whether caching mysql queries works on the greater scale of things? (5000 id's)

dwilson

2:00 pm on Jun 24, 2003 (gmt 0)

10+ Year Member



txbakers, disk space is cheaper than processor power. scumm_bar's concept is good for sites that aren't hugely customized. If the pages vary per user, that's another issue. It's become popular enough that it's a selling point of IIS6 which does more or less what's being proposed here.