I have a script that grabs news headlines from remote RSS feeds, probably 20 of them.
In a loop I store all these headlines into my DB so I can display them on my site.
The script runs fine; it automatically fires every 5 minutes to check for new titles.
However, the problem occurs when a remote site is down or unavailable. When that happens my script just hangs, trying to connect to it.
I have the PHP time limit set to 5 minutes, but this is still a major problem.
Presumably because of this behaviour, my server went down yesterday and was offline for 10-15 minutes. My hosting provider told me that a script (I'm almost certain it's this particular script) used 100% of the CPU and that the server overloaded.
In order to avoid these problems in the future, I've written a small piece of code to check whether a remote RSS feed exists:
$url = 'http://www.example.com/rss.xml';
// Request only the response headers for the feed URL
$response = get_headers($url);
// Pull the status code out of the first header line, e.g. "HTTP/1.1 200 OK"
preg_match("/HTTP\/1\.[01]\s(\d{3})/", $response[0], $matches);
$status = $matches[1];
if ($status == "200") {
    echo 'it works !';
} else {
    echo 'no good..';
}
This works as it should, BUT I'm not certain how it will behave if the external site goes down. Will it just return a 404, or will this one hang as well? Then I'd have a double problem :D
Could I somehow limit the execution time of this?
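From what I understand, get_headers() goes through PHP's HTTP stream wrapper, so if the remote host is unreachable it won't return a 404 — it will block until the socket times out, and return false on failure. A rough sketch of how you could cap that wait using the default_socket_timeout setting (the 5-second value is just an example):

// Cap how long PHP's stream functions (including get_headers) wait on a socket
ini_set('default_socket_timeout', '5');

$response = get_headers('http://www.example.com/rss.xml');
if ($response === false) {
    // Connection failed or timed out - treat the feed as unreachable
    echo 'no good..';
} else {
    preg_match("/HTTP\/1\.[01]\s(\d{3})/", $response[0], $matches);
    echo ($matches[1] == "200") ? 'it works !' : 'no good..';
}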
function getPage($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    // Force a fresh connection instead of reusing a cached one
    curl_setopt($ch, CURLOPT_FRESH_CONNECT, TRUE);
    // Give up if the connection can't be established within 6 seconds
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT_MS, 6000);
    // Return the response as a string instead of printing it
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    // Abort the whole transfer if it takes longer than 6 seconds in total
    curl_setopt($ch, CURLOPT_TIMEOUT_MS, 6000);
    $html = curl_exec($ch);
    curl_close($ch);
    if ($html === false) {
        // curl_exec() failed: connection refused, timed out, etc.
        return "0";
    }
    return $html;
}

// you call it like so..
$status = getPage($url);
If $status comes back as "0", you cancel your operations, as the remote site is obviously down and/or unreachable.
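In the original scenario (looping over ~20 feeds and storing the headlines in the DB), that check lets you skip a dead feed and move on instead of hanging. A minimal sketch, where $feedUrls and saveHeadlines() are hypothetical placeholders for your own feed list and DB code:

$feedUrls = array(
    'http://www.example.com/rss.xml',
    'http://www.example.org/feed.xml',
);

foreach ($feedUrls as $url) {
    $xml = getPage($url);
    if ($xml === "0") {
        // Feed is down or unreachable - skip it and continue with the next one
        continue;
    }
    saveHeadlines($xml); // parse the XML and insert any new titles into the DB
}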
I've set the timeout to 6 seconds and it works great :D
NOTE: a timeout under 1 second (1000 milliseconds) did not work for me, so use values above that figure.
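If I remember right, that 1-second floor comes from libcurl using signals for its DNS timeout when it's built with the standard resolver. The commonly suggested workaround (I haven't tested it here, so treat it as an assumption) is to disable signal handling on the handle:

// Supposedly required for timeouts below 1000 ms to take effect
curl_setopt($ch, CURLOPT_NOSIGNAL, 1);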