The best way to run a 5 day PHP script from the CLI?

CLI newbie - am I setting myself up for a fall?

inbound

12:50 pm on Nov 18, 2006 (gmt 0)

I'm looking for a little advice from seasoned PHP CLI users; any help would be much appreciated.

I've got a massive amount of data to process in a MySQL database. I'd like to run it from the CLI so that it can run until complete (if it breaks then it can be restarted from where it got to).

My previous data manipulation has all been triggered by browsers as the scripts were just being tested and it was easier that way. I'm sure they are solid and now just need time to run.

Are there any problems that I may encounter by running a script that will take days to complete? I'm thinking of resource issues, what happens if I need to stop the script...

The DB is likely to reach around 5GB, with around 100 million selects/updates required to get there (DB maintenance is another issue, but let's assume that everything is OK there).

Frank_Rizzo

1:06 pm on Nov 18, 2006 (gmt 0)

I think you need to split the process into smaller chunks - it's just too risky to leave something running for days.

How do you know if the procedure is working, completes, or just hangs? Are you doing some kind of checkpoint logging?

I don't know exactly what it is you are doing, but I'd recommend splitting the procedure into distinct steps and logging after each one.

e.g. For one database update routine I have I do something like this:

initialise.php (if ok log initialised)
import.php (if ok log import_success)
index.php (if ok log indexes_created)
magic_numbers.php (if ok log magic_numbers_generated)
consistency_check.php (if ok log database_consistent)

Logging can be a simple pipe out to a log.txt file.
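
A minimal sketch of this kind of step-by-step runner with a marker log, assuming PHP 7 or later. The `run_steps` function, the temp-file log and the placeholder closures are hypothetical; in practice each closure would do the work of the corresponding script above.

```php
<?php
// Hypothetical step runner: each step is a callable that returns true on success.
// Finished steps are recorded as marker lines in a plain-text log, so a rerun
// skips anything that already succeeded and resumes at the first unfinished step.

function run_steps(array $steps, string $logFile): bool
{
    // Load the markers written by earlier runs (none if the log does not exist).
    $done = is_file($logFile)
        ? file($logFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
        : [];

    foreach ($steps as $marker => $step) {
        if (in_array($marker, $done, true)) {
            continue; // finished in a previous run
        }
        if ($step() !== true) {
            error_log("Step failed, marker not written: $marker");
            return false; // stop here; the next run retries this step
        }
        // Append the marker only after the step has succeeded.
        file_put_contents($logFile, $marker . "\n", FILE_APPEND | LOCK_EX);
    }
    return true;
}

// Placeholder closures standing in for the real scripts.
$steps = [
    'initialised'             => function () { return true; },
    'import_success'          => function () { return true; },
    'indexes_created'         => function () { return true; },
    'magic_numbers_generated' => function () { return true; },
    'database_consistent'     => function () { return true; },
];

$logFile = tempnam(sys_get_temp_dir(), 'steps'); // stand-in for log.txt
$ok = run_steps($steps, $logFile);
echo $ok ? "all steps logged\n" : "stopped at a failed step\n";
```

Because a marker is only written after its step returns true, the last line of the log shows how far the run got.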

That's a simplistic view but I really think you need to sort out why / if your routine should take 5 days to complete. Something is not right with that.

inbound

1:43 pm on Nov 18, 2006 (gmt 0)

Thanks. Splitting the tasks is an option; I'll look at that.

As for seeing if the process is working correctly, I can manually check the DB to see the correct type of data is being created. Thankfully it's not too difficult to verify that the process is working, it's just a lot of processing.

I'm interested to hear if people use PHP scripts that continuously run or if they just use cron jobs to trigger regular updates etc.

justageek

3:20 pm on Nov 18, 2006 (gmt 0)

I do it all the time. I just override the default 30-second timeout and let it go. The only problem I have ever seen is that memory usage sometimes gets out of control, but since I use a DB to track where the script is while it runs, I just restart any trouble scripts and they pick up where they left off.
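
A rough sketch of that restart-and-resume pattern, assuming PHP 7.1 or later and rows that can be processed in ascending id ranges. All function names here are hypothetical. The checkpoint lives in a file only so the example runs standalone; a one-row table in the database would serve the same purpose. Note that under the CLI the execution time limit already defaults to 0 (unlimited), so the `set_time_limit(0)` call mainly matters when the same script is run through a web server.

```php
<?php
set_time_limit(0); // no execution time limit

// Read the last fully processed id (0 if nothing has been processed yet).
function load_checkpoint(string $file): int
{
    return is_file($file) ? (int) file_get_contents($file) : 0;
}

// Write to a temp file and rename it, so a crash cannot leave a half-written checkpoint.
function save_checkpoint(string $file, int $id): void
{
    file_put_contents($file . '.tmp', (string) $id);
    rename($file . '.tmp', $file);
}

// Process ids 1..$maxId in batches, resuming after the saved checkpoint.
// Returns 'restart' if memory use passes $memoryCap while work remains,
// otherwise 'done'.
function run_batches(int $maxId, int $batchSize, string $file, int $memoryCap, callable $processRange): string
{
    $from = load_checkpoint($file);
    while ($from < $maxId) {
        $to = min($from + $batchSize, $maxId);
        $processRange($from + 1, $to); // e.g. select/update rows with ids in this range
        save_checkpoint($file, $to);
        $from = $to;
        if ($from < $maxId && memory_get_usage(true) > $memoryCap) {
            return 'restart'; // let a wrapper or cron job start a fresh process
        }
    }
    return 'done';
}

$checkpoint = tempnam(sys_get_temp_dir(), 'ckpt'); // stand-in for a fixed path or DB row
$status = run_batches(1000, 250, $checkpoint, 256 * 1024 * 1024, function (int $first, int $last) {
    // placeholder for the real work on rows $first..$last
});
echo $status, "\n";
```

The checkpoint is saved only after a batch finishes, so a killed or crashed run repeats at most one batch when it is started again. That is only safe if the per-batch work can be repeated without harm.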

A few ways I've used long-running scripts are:

1. Run from a command window and output the progress to the screen.

2. Run from batch files and run behind the scenes.

3. Store each script in a DB and have an executable come along and fire them up based on a time-to-start field in the DB.
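
The third approach could look roughly like the sketch below. It is only an illustration: the row layout (`id`, `script`, `start_at`, `started`), the script names and both functions are made up, and the rows are hard-coded where a real dispatcher would query the DB.

```php
<?php
// Pick the jobs that have not been started and whose start time has passed.
function due_jobs(array $jobs, int $now): array
{
    $due = array_filter($jobs, function (array $job) use ($now) {
        return !$job['started'] && $job['start_at'] <= $now;
    });
    // Oldest start time first, so overdue scripts are launched in order.
    usort($due, function (array $a, array $b) {
        return $a['start_at'] <=> $b['start_at'];
    });
    return $due;
}

// Build the CLI command for a job. A real dispatcher would hand this to
// exec() or proc_open() and then mark the row as started.
function launch_command(array $job): string
{
    return 'php ' . escapeshellarg($job['script']);
}

// Hypothetical rows, as they might come back from a jobs table.
$jobs = [
    ['id' => 1, 'script' => 'job_a.php', 'start_at' => 100, 'started' => false],
    ['id' => 2, 'script' => 'job_b.php', 'start_at' => 300, 'started' => false],
    ['id' => 3, 'script' => 'job_c.php', 'start_at' => 50,  'started' => true],
    ['id' => 4, 'script' => 'job_d.php', 'start_at' => 80,  'started' => false],
];

// With "now" at 200, jobs 4 and 1 are due; job 2 is in the future, job 3 already ran.
foreach (due_jobs($jobs, 200) as $job) {
    echo launch_command($job), "\n";
}
```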

JAG