Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
Why does my curl loop break after 100 cycles?
adder
msg:4655802
4:37 pm on Mar 20, 2014 (gmt 0)

Hi,

I've got a series of scripts one of which communicates with an external source via curl.

Let's say it has to download 1200 snippets of data every day. There's a script that populates a database with the 1200 urls that will have to be visited today.

Then a simple curl function loops through those urls and saves data back into the database.

Because there are several services communicating with the data source, I've introduced a 10 second pause between the requests.

Unfortunately, after looping through 70 - 100 urls, the script suddenly stops. It's not a case of the source blocking my connections. So why does it break, and how do I make sure it loops through the entire 1200 urls?


<?php
set_time_limit(0);

function get_data($url)
{
    // blah blah blah - a simple curl function...
}

$database = "***";
mysql_connect("***", "***", "***") or die("Error connecting to database: " . mysql_error());
mysql_select_db($database) or die(mysql_error());

$sql = mysql_query("SELECT * FROM `list`") or die(mysql_error());

while ($row = mysql_fetch_array($sql)) {
    $url  = $row['url'];
    $id   = $row['id'];
    $data = get_data($url);
    $clean_data = mysql_real_escape_string($data);

    $sql2 = "UPDATE `list` SET raw='$clean_data' WHERE id='$id'";
    $result2 = mysql_query($sql2);
    sleep(10);
    flush();
}
?>

Maybe there's a way to detect the loop has stopped and pick it up from the next empty table row?
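[One sketch of that resume idea, assuming the `raw` column is NULL or empty until it has been filled: filter the SELECT so that re-running the script naturally picks up only the unprocessed rows. Column names follow the script above; adjust to your schema.]

```php
<?php
// Sketch: resume after a crash by selecting only rows that have no
// data yet. Assumes `raw` is NULL (or '') until the fetch succeeds,
// so simply re-running the script continues where it stopped.
$sql = mysql_query("SELECT * FROM `list` WHERE raw IS NULL OR raw = ''")
    or die(mysql_error());

while ($row = mysql_fetch_array($sql)) {
    $url  = $row['url'];
    $id   = $row['id'];
    $data = get_data($url);

    // Only mark the row done if we actually got something back
    if ($data !== '' && $data !== false) {
        $clean_data = mysql_real_escape_string($data);
        mysql_query("UPDATE `list` SET raw='$clean_data' WHERE id='$id'")
            or die(mysql_error());
    }
    sleep(10);
}
?>
```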
Thanks

 

brotherhood of LAN
msg:4655815
5:35 pm on Mar 20, 2014 (gmt 0)

Are you doing this via the command line? If you're not, the browser itself can time out.

Also make sure you have a timeout in your get_data curl settings.

Your script doesn't show whether you use standalone curl or PHP's curl library. If you're using standalone curl, make sure to pass the --globoff flag, as URLs like http://www.example.com/[1-1000].htm would make curl fetch the page 1000 times over.

Add some error checking and that may give you a better idea.
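[A sketch of what those timeouts and error checks could look like inside the PHP curl wrapper; the option values here are examples, not recommendations:]

```php
<?php
// Sketch of get_data() using PHP's curl extension, with explicit
// timeouts and basic error checking so one slow URL can't hang
// the whole loop.
function get_data($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return body as string
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);   // seconds to connect
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // seconds for whole request

    $data = curl_exec($ch);
    if ($data === false) {
        // Log the failure and move on instead of stalling
        error_log("curl failed for $url: " . curl_error($ch));
        $data = '';
    }
    curl_close($ch);
    return $data;
}
?>
```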

adder
msg:4656678
12:02 pm on Mar 24, 2014 (gmt 0)

Thank you for the answer. You're right, I'm using a browser, and it times out. I have tried to increase the timeout via regedit, but it hasn't changed anything, which makes you wonder why those values are in regedit if they don't make a difference :)
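[Following up on the command-line suggestion above, a sketch of running the script outside the browser entirely; the script path is hypothetical. CLI PHP ignores the browser/server timeout, and set_time_limit(0) already covers PHP's own limit.]

```shell
# Run the script directly with the PHP CLI instead of through a browser
php /path/to/your-script.php

# Or detach it so it keeps running even if you close the terminal,
# logging output to a file for later inspection
nohup php /path/to/your-script.php > fetch.log 2>&1 &
```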

© Webmaster World 1996-2014 all rights reserved