Forum Moderators: coopster

Best Methods For Caching Files to Disk

PHP dumping to static.html and file locking

         

trillianjedi

9:07 am on Aug 11, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I have a biggish SQL query executed from example.php that produces something on a homepage and takes a little while to execute (1/2 to 3/4 of a second).

The data changes about 50 times for every 5,000 page views, so it's a prime target for caching. My idea is to have the query run only when needed (so when change_happening.php is executed) and then dump the result to disk as HTML, ready to simply be included into the main page.

Quite a straightforward thing, but I'm not sure of the best way to lock out other Apache threads while this is happening, or if I even need to worry about it (perhaps the Linux OS handles file locks on a rewrite anyway?).

So far I'm thinking that change_happening.php would work like this:-

1. Run the Query and generate the HTML
2. Open the file "data.html" for rewrite and dump in the generated HTML data.
3. Close the file

example.php would then simply have in the relevant section

1. include("data.html");
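For reference, the three write steps above could be sketched roughly as follows. This is only an illustration of the naive approach: render_rows() and write_cache() are hypothetical names, the rows stand in for whatever the real query returns, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: turns query rows into the HTML fragment.
function render_rows(array $rows): string
{
    $html = "<ul>\n";
    foreach ($rows as $row) {
        // Escape each value so the cached fragment is safe to output as-is.
        $html .= '<li>' . htmlspecialchars($row, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return $html . "</ul>\n";
}

// Steps 1-3 in one call: generate the HTML, open the file for rewrite,
// dump the data and close. No locking is attempted here.
function write_cache(string $path, array $rows): bool
{
    return file_put_contents($path, render_rows($rows)) !== false;
}
```

change_happening.php would then call something like write_cache("data.html", $rows) once the query has run. A reader that hits the file while it is being rewritten could still see a partial fragment, which is the concern raised below.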

What happens if an Apache thread is trying to run example.php and trying to include data.html at the same instant that I'm trying to re-write it? Does Apache/Linux OS automatically lock the file once it's been opened for re-write, thereby queuing this thread behind the lock until it's ready, or do I need to explicitly lock the file?

We had an interesting thread in here about flock() and its explicit release (thanks for the input Coop and Adam):-

[webmasterworld.com...]

.... but here the difference is I'm opening the file to be rewritten rather than appended.

I'd prefer not to use flock(), as I don't like the way that it's implemented in PHP (no final guarantee of lock release).
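For comparison, a writer that takes an exclusive lock and releases it explicitly might look like the sketch below. write_locked() is a hypothetical name, and the try/finally blocks assume PHP 5.5 or later (the type hints assume PHP 7). Note the limitation: flock() locks are advisory on Linux, and a plain include() does not request a lock, so this only protects against other code that also calls flock() on the same file.

```php
<?php
// Hypothetical helper: rewrite $path while holding an exclusive lock.
function write_locked(string $path, string $html): bool
{
    // Mode 'c' opens for writing without truncating, so the old
    // contents stay in place until the lock is actually held.
    $fh = fopen($path, 'c');
    if ($fh === false) {
        return false;
    }
    try {
        if (!flock($fh, LOCK_EX)) {
            return false;
        }
        try {
            // Truncate and rewrite only once the lock is held.
            return ftruncate($fh, 0)
                && (fwrite($fh, $html) !== false)
                && fflush($fh);
        } finally {
            flock($fh, LOCK_UN); // explicit release, even on failure
        }
    } finally {
        fclose($fh);
    }
}
```

The finally blocks run on every exit path from the function, including the early returns, so the release does not depend on reaching the end of the script.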

I suppose the other thing I could do is this:-

1. Rename existing data.html to data.old
2. Do the work
3. Rename data.old to data.html

And then example.php would do:-

1. If File_Exists(data.html) { include it }

So the worst case scenario here is that, in the event of an extreme timing coincidence (pretty rare bearing in mind the 1/2 to 3/4 second "window" in which this can happen), this viewer simply misses this particular block out altogether, but gets the rest of the page.
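One way the "skip the block if the file is missing" behaviour could be sketched on the example.php side is shown below. It swaps include() for file_get_contents(), on the assumption that data.html holds plain HTML with no PHP in it, so that the existence check and the read happen in a single call. emit_cached_block() is a hypothetical name, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: output the cached fragment if it can be read,
// otherwise output nothing and report false so the page can carry on.
function emit_cached_block(string $path): bool
{
    // @ suppresses the warning raised when the file is missing.
    $html = @file_get_contents($path);
    if ($html === false) {
        return false;
    }
    echo $html;
    return true;
}
```

example.php would call emit_cached_block("data.html") in the relevant section and could ignore the return value, or use it to log how often the block is being missed.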

Any thoughts? Any "best way" approaches to this kind of thing?

Thanks,

TJ

whoisgregg

1:37 pm on Aug 11, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could dramatically reduce the amount of time that data.html is unavailable if you use this sequence:

1. Write new data to data.new.html
2. Rename data.html to data.old.html
3. Rename data.new.html to data.html
4. Delete data.old.html

Renaming existing files is fast.
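A minimal sketch of the write-then-rename idea follows. publish_cache() is a hypothetical name and the type hints assume PHP 7 or later. On Linux, rename() within a single filesystem replaces an existing destination in one step, so this sketch goes straight from step 1 to step 3 and never moves the live file out of the way; treat that as an assumption to verify on your own setup rather than a guarantee for every platform.

```php
<?php
// Hypothetical helper: publish $html at $target via a temporary file.
function publish_cache(string $target, string $html): bool
{
    // Create the temp file next to the target so that rename()
    // stays on the same filesystem.
    $tmp = tempnam(dirname($target), 'cache_');
    if ($tmp === false) {
        return false;
    }
    if (file_put_contents($tmp, $html) === false) {
        unlink($tmp);
        return false;
    }
    // tempnam() creates the file with mode 0600; widen it so the
    // web server can read the published file.
    chmod($tmp, 0644);
    if (!rename($tmp, $target)) {
        unlink($tmp);
        return false;
    }
    return true;
}
```

With this shape, a request that arrives mid-update should see either the complete old file or the complete new one, and the slow part (running the query and writing the data) happens entirely against the temporary file.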

trillianjedi

2:04 pm on Aug 11, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Thanks, yes that's a much better idea.

I'm still a bit confused as to whether I need to worry about locking or not though?

whoisgregg

9:12 pm on Aug 11, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'm not sure how file locking would work in this scenario. If it does turn out to be a problem, and the data absolutely must be included on each page, another way to address it would be to give the page a few chances to get the file before erroring.

Something like this completely untested pseudocode:

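$path_to_file = 'data.html'; // assumed path to the cached file
$slept_counter = 0;
// Retry for up to about one second while the file is missing
while (!file_exists($path_to_file)) {
    if ($slept_counter >= 10) {
        die; // well, call a "send error" function
    }
    usleep(100000); // sleep for 1/10 of a second
    $slept_counter++;
}
include($path_to_file);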

This code would have its own problems, but it gives you an idea of what I mean. :)