
Why does my curl loop break after 100 cycles?

4:37 pm on Mar 20, 2014 (gmt 0)

Preferred Member from GB 

10+ Year Member Top Contributors Of The Month

joined:July 25, 2005
votes: 10


I've got a series of scripts, one of which communicates with an external source via curl.

Let's say it has to download 1200 snippets of data every day. There's a script that populates a database with the 1200 urls that will have to be visited today.

Then a simple curl function loops through those urls and saves data back into the database.

Because there are several services communicating with the data source, I've introduced a 10 second pause between the requests.

Unfortunately, after looping through 70 - 100 urls, the script suddenly stops. It's not a case of the source blocking my connections. So why is it breaking, and how do I make sure it loops through all 1200 urls?

function get_data($url)
{
// a simple curl function - fetches the url and returns the response body
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}

$database = "***";
mysql_connect("***", "***", "***") or die("Error connecting to database: ".mysql_error());
mysql_select_db($database) or die(mysql_error());

$sql = mysql_query("SELECT * FROM `list`") or die(mysql_error());

//while loop here
while($row = mysql_fetch_array($sql))
{//opens while brackets
$url = $row['url'];
$id = $row['id'];
$data = get_data($url);
$clean_data = mysql_real_escape_string($data);

$sql2 = "UPDATE `list` SET raw='$clean_data' WHERE id='$id'";
$result2 = mysql_query($sql2);

sleep(10); // the 10 second pause between requests
}//closes while brackets

Maybe there's a way to detect that the loop has stopped and pick it up again from the next empty table row?
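
On that "pick it up from the next empty row" idea: if the SELECT only pulls rows whose raw column is still empty, re-running the script resumes automatically from wherever it stopped. A rough sketch, assuming unfetched rows have raw NULL or empty and using the same old mysql_* API as the code above:

```php
<?php
// Resumable version (sketch): only fetch rows that still have no data,
// so simply re-running the script continues where the last run stopped.
$sql = mysql_query("SELECT * FROM `list` WHERE raw IS NULL OR raw = ''")
    or die(mysql_error());

while ($row = mysql_fetch_array($sql)) {
    $data = get_data($row['url']);
    if ($data === false) {
        continue; // failed fetch: row stays empty and is retried on the next run
    }
    $clean_data = mysql_real_escape_string($data);
    mysql_query("UPDATE `list` SET raw='$clean_data' WHERE id='{$row['id']}'")
        or die(mysql_error());
    sleep(10); // keep the 10 second pause between requests
}
```

You could run this from cron every hour or so until all 1200 rows are filled.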
5:35 pm on Mar 20, 2014 (gmt 0)

Moderator from GB 

WebmasterWorld Administrator brotherhood_of_lan is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 30, 2002
votes: 1

Are you doing this via the command line? If not, the browser can time out too.

Also make sure you have a timeout in your get_data curl settings.
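
With PHP's curl extension that would look something like this (the option values are illustrative, not recommendations):

```php
<?php
// Timeout settings for a single curl request (sketch).
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); // give up connecting after 10s
curl_setopt($ch, CURLOPT_TIMEOUT, 30);        // abort the whole transfer after 30s
$data = curl_exec($ch);
curl_close($ch);
```

That way one slow url fails fast instead of hanging the whole loop.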

Your script doesn't show whether you use standalone curl or PHP's library for it. If you're using standalone curl, make sure to pass the --globoff flag, as URLs like http://www.example.com/[1-1000].htm would otherwise make curl fetch the page 1000 times over.

Add some error checking and that may give you a better idea.
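
For example, with PHP's curl extension you can log why a request failed (sketch; $ch is the handle from curl_init()):

```php
<?php
// Basic curl error checking (sketch).
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
if ($data === false) {
    // record which url failed and why, then decide whether to retry or skip
    error_log('curl failed for ' . $url . ': ' . curl_error($ch)
        . ' (errno ' . curl_errno($ch) . ')');
}
curl_close($ch);
```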
12:02 pm on Mar 24, 2014 (gmt 0)

Preferred Member from GB 

10+ Year Member Top Contributors Of The Month

joined:July 25, 2005
votes: 10

Thank you for the answer. You're right, I'm using a browser and it times out. I have tried to increase the timeout via regedit, but it hasn't changed anything, which makes you wonder why those values are in regedit if they don't make a difference :)
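
For what it's worth, running the script from the command line (php script.php) sidesteps the browser timeout entirely. Inside the script you can also lift PHP's own limits with standard functions (a sketch, not specific to this setup):

```php
<?php
// At the top of a long-running script:
set_time_limit(0);       // remove PHP's max_execution_time limit
ignore_user_abort(true); // keep running even if the browser disconnects
```

The CLI SAPI defaults to no execution time limit anyway, so this mainly matters when the script is still triggered through a web server.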
