copy() from remote server times out


csdude55

10:25 pm on Jan 1, 2016 (gmt 0)

I'm writing a script that should copy 12 files (711MB total) from a remote server to mine via a cron job that runs once a day. The problem is that it times out before completing!

Here's the script:


$baseURL = 'http://www.example.com/';

$filenames = array(
    'one',
    'two',
    ...
    'twelve'
);

foreach ($filenames as $key) {
    $file    = $baseURL . $key . '.zip';
    $newfile = $key . '.zip';

    if (!copy($file, $newfile)) {
        file_put_contents('error.log', "Copy failed: $key\n", FILE_APPEND | LOCK_EX);
    }
}

echo "Done";


I tried running it at 4am this morning, when server traffic was at its lowest. It ran for about 20 minutes and copied 276MB worth of data, but then stopped. I first ran it through the browser; when that timed out, I tried running it via PuTTY instead, but it timed out both ways.

Any suggestions? My only thought was to create 12 different scripts to run at 12 different times, but I'm not sure how to make the system wait until the first download is complete before attempting the second.

TIA!

robzilla

11:14 pm on Jan 1, 2016 (gmt 0)

Did you check your web server error logs? Perhaps PHP is running out of memory.
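
A quick way to test the out-of-memory theory, as a sketch: log the script's peak memory usage alongside the configured limit and compare.

// Hedged sketch: log peak memory usage so an out-of-memory theory can be
// confirmed or ruled out against the memory_limit setting.
file_put_contents(
    'error.log',
    'Peak memory: ' . memory_get_peak_usage(true) . ' bytes, limit: '
        . ini_get('memory_limit') . "\n",
    FILE_APPEND | LOCK_EX
);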

whitespace

11:21 pm on Jan 1, 2016 (gmt 0)

How many files are downloaded successfully? Is it failing in the middle of a file? Do all the source files exist? Have you tried running the script on each file separately, in case there is an error with a specific file (its size, perhaps)? What specific errors are you getting?

csdude55

12:02 am on Jan 2, 2016 (gmt 0)

I was able to download each of the files to my PC without any problems, so I know the files exist on the remote server. I didn't have my script check that the file existed, though, and that's a good point, so that's the first change I'll make:


if (file_exists($file)) {
    if (!copy($file, $newfile)) {
        file_put_contents('error.log', "Copy failed: $key\n", FILE_APPEND | LOCK_EX);
    }
} else {
    file_put_contents('error.log', "File does not exist: $key\n", FILE_APPEND | LOCK_EX);
}


My server (Linux) reports raw byte counts, while my PC shows sizes in kilobytes of 1,024 bytes, so it's a little bit of a pain to see whether an entire file downloaded. Here's my full list:


File      Server copy (bytes)   PC copy (KB)   Status
one                         0         39,134   not complete
two                    87,314           85.2   complete
three                       0         39,270   not complete
four               17,519,861         17,110   complete
five                        0         16,417   not complete
six                61,095,547         59,664   complete
seven                       0         97,452   not complete
eight              34,609,660         33,799   complete
nine              111,373,661        108,764   complete
ten                         0        244,383   not complete
eleven             51,527,730         68,015   not complete
twelve                      0          3,410   not complete
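
For the unit headache above, dividing the server's byte count by 1,024 gives the PC's kilobyte figure directly. A quick sketch, using "four" from the table:

// Sanity check: a complete server copy divided by 1,024 should match the
// PC's kilobyte figure ("four" from the table above).
$serverBytes = 17519861;
echo round($serverBytes / 1024); // 17110, matching the PC's 17,110 KB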


Considering that some of the file sizes are 0, I'm assuming the file was found but not copied for whatever reason. They're all going to the same directory, so it shouldn't be a permissions issue, but it's odd that the first file in the list didn't download at all when the second did.

It looks like every file that started copying completed EXCEPT for "eleven", which stopped in the middle. So I'm guessing it intended to go back to "one" later, but timed out partway through "eleven"?

The server I'm using runs cPanel, and it's a GoDaddy virtual server. The error log in cPanel doesn't show any PHP errors at all, and the dummy error.log that I created in the script just shows that the copy failed on the files that have a 0 filesize.

Robzilla, that's a good point, too. The server's default memory limit is 32MB, so I'll change that on the server to -1 (no limit) tonight and test again.

If it still fails, I'll try running it one script at a time.

I'll report back either way.

robzilla

10:23 am on Jan 2, 2016 (gmt 0)

Perhaps cURL instead of the copy() function will be more reliable for downloading large files: you can use CURLOPT_FILE to save straight to disk, and you generally get many more options for controlling the transfer than copy() offers.
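
A minimal sketch of that approach; the CURLOPT_FAILONERROR and timeout choices below are illustrative assumptions, not requirements:

// Minimal cURL-to-disk sketch. Timeout values are illustrative.
$ch = curl_init('http://www.example.com/one.zip');
$fp = fopen('one.zip', 'w');

curl_setopt($ch, CURLOPT_FILE, $fp);           // write the body straight to disk
curl_setopt($ch, CURLOPT_FAILONERROR, true);   // treat HTTP errors (>= 400) as failures
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);  // seconds allowed to connect
curl_setopt($ch, CURLOPT_TIMEOUT, 3600);       // overall cap for one transfer

if (curl_exec($ch) === false) {
    echo 'Transfer failed: ' . curl_error($ch) . "\n";
}

curl_close($ch);
fclose($fp);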

csdude55

11:17 am on Jan 2, 2016 (gmt 0)

Maybe you're right, Rob.

Here's what I've done so far, and the results:

1. I changed memory_limit to -1 (no limit), but that seemed to have no impact. The copy still stopped in the middle of "eleven".

2. I modified the program to JUST download "eleven" and "twelve" (which had been 0 so far), and that worked just fine. So the problem really does seem to be the script timing out, not a problem with those particular files.

3. BUT I seem to be having an unrelated problem with "one", "three", "five", "seven", and "ten". I modified the program with file_exists() as noted above, but error.log still just shows "Copy failed". So the script finds those files; it just doesn't copy them.

So I modified the error.log line to:


$err = error_get_last();
$msg = ($err !== null) ? $err['message'] : 'no PHP error recorded';
file_put_contents('error.log', "Copy failed: $key $msg\n", FILE_APPEND | LOCK_EX);


Then I ran it again, and "one" copied! But then I ran it again for testing, and "one" went back to 0. The error.log file, though, still just showed "Copy failed", with nothing for $err['message'].

Then I ran it again, and it copied again!

So the problem here seems to be intermittent. Maybe the remote server isn't responding for a second and I have to catch it at just the right time? I can't find any legit error messages, but it looks like I'll have to somehow test whether the file downloaded completely, and if not, try again a few minutes later.
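
One hedged way to do that check-and-retry, assuming the remote server sends a Content-Length header (the helper name is made up for illustration):

// Hypothetical helper: copy a remote file, verify its size against the
// server's Content-Length, and retry a few times on failure or mismatch.
function copy_with_retry($url, $dest, $tries = 3, $waitSeconds = 60)
{
    for ($i = 0; $i < $tries; $i++) {
        if (@copy($url, $dest)) {
            $headers  = array_change_key_case(get_headers($url, 1));
            $expected = isset($headers['content-length'])
                ? (int) $headers['content-length'] : -1;
            clearstatcache();                // filesize() caches its results
            if ($expected === -1 || filesize($dest) === $expected) {
                return true;                 // size matches (or is unknown)
            }
        }
        sleep($waitSeconds);                 // wait before the next attempt
    }
    return false;
}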

Before getting that complicated with it, I'll try to rewrite using CURL as suggested, and see if that works any more reliably.

jmccormac

11:31 am on Jan 2, 2016 (gmt 0)

Does your server have some kind of PHP script execution time limit? If so, that could be terminating the script when it exceeds the limit. The default, I think, is about 30 seconds, set in /etc/php.ini. (Just a guess.)
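
If that is the culprit, the limit can be lifted for a single long-running script rather than server-wide; a minimal sketch:

// Lift the execution time limit for this script only (0 = no limit).
set_time_limit(0);
// Equivalent via the ini setting:
ini_set('max_execution_time', '0');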

Regards...jmcc

csdude55

11:16 am on Jan 3, 2016 (gmt 0)

Jmcc, I looked through the PHP Configuration Editor in WHM/cPanel and found "max_execution_time", and as you suggested it was set at the default of 30. But the script has been running for about 20 minutes, so if that directive does what I assumed, I really can't understand why the script is working at all?

Either way, though, I rewrote the script using cURL, as Rob suggested, and so far it seems to be working! I haven't opened the files to make sure they're free of errors, but it ran a LOT faster (745MB copied in 34 minutes), and no 0 file sizes!

I replaced the foreach() in the original script with this:


foreach ($filenames as $key) {
    $file    = $baseURL . $key . '.zip';
    $newfile = $key . '.zip';

    if (file_exists($file)) {
        $ch = curl_init($file);
        $fp = fopen($newfile, "w");

        curl_setopt($ch, CURLOPT_FILE, $fp);
        curl_setopt($ch, CURLOPT_HEADER, 0);

        curl_exec($ch);

        /* Should this replace the above line for error logging? */
        // if (curl_exec($ch) === false)
        //     file_put_contents('error.log', "Curl error: $key -> " . curl_error($ch) . "\n", FILE_APPEND | LOCK_EX);

        curl_close($ch);
        fclose($fp);
    } else {
        file_put_contents('error.log', "File not found: $key\n", FILE_APPEND | LOCK_EX);
    }
}


I haven't had a lot of experience with cURL, so I wasn't sure how to log errors, or whether there's a better way to copy multiple files without opening and closing a handle each time (I may try a rewrite with curl_multi_init() just for the sake of learning). Any suggestions for improvements would be appreciated, but otherwise this seems to be doing the trick :-)
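
On the error-logging question in the code comment above: the usual pattern is to call curl_exec() once and test its return value, rather than calling it a second time. A sketch:

// Call curl_exec() once and keep the result; calling it twice would run the
// transfer twice. With CURLOPT_FILE set, curl_exec() returns true on success.
if (curl_exec($ch) === false) {
    file_put_contents('error.log', "Curl error: $key -> " . curl_error($ch) . "\n",
        FILE_APPEND | LOCK_EX);
}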

Thanks, all!

robzilla

10:48 pm on Jan 3, 2016 (gmt 0)

Best tool for the job :-) Glad you got it working.

There's no time limit on script execution when you run it from the command line, but it should time out when you call it through the web server. (Also worth knowing: on Linux, time the script spends in stream operations and other system calls doesn't count toward max_execution_time, which may be why your web run lasted well past 30 seconds.)
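
A quick way to confirm which limit applies in each environment, as a sketch:

// Under the CLI SAPI, max_execution_time defaults to 0 (unlimited); under a
// web SAPI it typically defaults to 30 seconds.
echo php_sapi_name(), "\n";            // e.g. "cli" or "apache2handler"
echo ini_get('max_execution_time');    // "0" on the CLI, "30" by default on the web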