Home / Forums Index / Code, Content, and Presentation / PHP Server Side Scripting
Forum Library, Charter, Moderators: coopster & jatar k

PHP Server Side Scripting Forum

    
Function to check if a site is down or not
turbohost




msg:3385366
 7:12 pm on Jul 3, 2007 (gmt 0)

Hi,

I am looking for a function to check if a remote url is down (giving a 404) or running (giving a 200) but I can't find one. Does someone know such a function? I think I'm looking for the wrong keywords in Google :-/

Turbo

 

turbohost




msg:3385368
 7:15 pm on Jul 3, 2007 (gmt 0)

I forgot to mention this in my previous post: if a site is down, the function has to recognize that within a few seconds. Let's say the function may wait no more than 7 seconds on a site that is down before moving on to the next site.

trillianjedi




msg:3385378
 7:29 pm on Jul 3, 2007 (gmt 0)

Something like this should do the trick. Check the PHP manual for "file_get_contents" for more info. This is off the top of my head, but it should give you the gist.

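<?php

// Returns true when the start of the page at $url could be fetched.
function checkUrl($url) {
    // Give up on a socket that stays silent for 7 seconds
    ini_set('default_socket_timeout', 7);
    // Read at most the first 20 bytes; returns false on failure
    $a = file_get_contents($url, FALSE, NULL, 0, 20);
    // $http_response_header is set by PHP's HTTP wrapper in this scope
    return ( ($a != "") && !empty($http_response_header) );
}

?>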

Note that "default_socket_timeout" is specified in seconds, not milliseconds, so the 7 above means 7 seconds.
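For reference, the timeout can also be set per request through a stream context instead of globally with ini_set(). What follows is only a minimal sketch, assuming PHP 7.1 or later; the names makeCheckContext and checkUrlWithContext are made up for illustration. The "timeout" option of the http wrapper is given in seconds.

```php
<?php
// Hypothetical helper: builds an HTTP stream context with a per-request timeout.
function makeCheckContext(float $seconds)
{
    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            'timeout' => $seconds, // in seconds; a float, so 2.5 is allowed
        ],
    ]);
}

// Hypothetical variant of checkUrl(): true when the start of the page could
// be fetched; false on a connection failure, a timeout, or an HTTP error
// status such as 404.
function checkUrlWithContext(string $url, float $seconds = 7.0): bool
{
    // @ hides the warning PHP raises when the request fails
    $body = @file_get_contents($url, false, makeCheckContext($seconds), 0, 20);
    return $body !== false;
}
```

Unlike ini_set(), this leaves the timeout of every other stream in the script untouched.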

TJ

[edited by: trillianjedi at 7:32 pm (utc) on July 3, 2007]

turbohost




msg:3385381
 7:32 pm on Jul 3, 2007 (gmt 0)

Thanks TJ

trillianjedi




msg:3385383
 7:35 pm on Jul 3, 2007 (gmt 0)

Oh, one word of warning as I think of it: that function (if it works as written) allows 7 seconds to load the first 20 bytes of a page (there's no point downloading the whole thing).

The 20 is only an upper limit, so a page shorter than 20 bytes is still returned in full. A completely empty page, however, will give a false negative...

turbohost




msg:3385385
 7:37 pm on Jul 3, 2007 (gmt 0)

I'm getting this error:

Warning: file_get_contents() expects at most 2 parameters, 5 given in /home/#*$!x/public_html/backlinks.php on line 27

bcolflesh




msg:3385389
 7:38 pm on Jul 3, 2007 (gmt 0)

If you have the CURL library installed, you can try this method as well:

[jellyandcustard.com...]

trillianjedi




msg:3385391
 7:41 pm on Jul 3, 2007 (gmt 0)

Are you on an old version of PHP? The offset and length arguments were only added in PHP 5.1, so what I wrote needs that version or later. It's off the top of my head, though, so look that function up in the manual.

If you're on PHP4, just use:-

$a = file_get_contents($url);

The downside is that it will download the whole page, which is a waste of bandwidth if you're just checking whether something is alive or dead.

You also need to check the manual for $http_response_header. It is an array of response header lines rather than a string, so depending on your version of PHP it may not be empty for a 404. You may get the full headers back and need to check the status line for a 404.
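To make the header check concrete: with the HTTP wrapper, $http_response_header holds the response header lines, and the status lines look like "HTTP/1.1 404 Not Found". Below is a minimal sketch of pulling the status code out of such an array, assuming PHP 7.1 or later. The function name statusCodeFromHeaders and the sample header list are hypothetical.

```php
<?php
// Hypothetical helper: returns the last HTTP status code found in a list of
// response header lines, or null when no status line is present.
function statusCodeFromHeaders(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // A status line looks like "HTTP/1.1 200 OK". After a redirect there
        // is one per hop, so keep overwriting to end up with the final one.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code;
}

// Example with a hand-written header list (not a real server response):
$sample = [
    'HTTP/1.1 301 Moved Permanently',
    'Location: /new',
    'HTTP/1.1 404 Not Found',
];
echo statusCodeFromHeaders($sample), "\n"; // prints 404
```

Inside a function that has just called file_get_contents() on an http:// URL, the same helper could be handed $http_response_header directly.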

Added:-

Looking at your snippet, I guess you're building a backlink checker, in which case you'll want to download the whole page anyway, RegEx for your domain inside <a href> tags, and make sure there isn't a nofollow or anything like that in there.

coopster




msg:3385407
 7:53 pm on Jul 3, 2007 (gmt 0)

And if you don't need the content from the page, just open a socket and send a HEAD as described in Checking if page exists [webmasterworld.com].
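The socket approach could look roughly like the sketch below. It is not the code from the linked thread, just a minimal illustration assuming PHP 7.1 or later and plain http:// URLs. The function names buildHeadRequest and headStatusLine are hypothetical.

```php
<?php
// Hypothetical helper: builds the text of a minimal HTTP/1.1 HEAD request.
function buildHeadRequest(string $hostHeader, string $path = '/'): string
{
    return "HEAD {$path} HTTP/1.1\r\n"
         . "Host: {$hostHeader}\r\n"
         . "Connection: close\r\n"
         . "\r\n";
}

// Hypothetical helper: sends a HEAD request over a plain socket and returns
// the status line (e.g. "HTTP/1.1 200 OK"), or null if nothing came back.
// Plain http:// only; https would need an "ssl://" host prefix and port 443.
function headStatusLine(string $url, float $timeout = 7.0): ?string
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['host'])) {
        return null;
    }
    $host = $parts['host'];
    $port = $parts['port'] ?? 80;
    $path = ($parts['path'] ?? '/')
          . (isset($parts['query']) ? '?' . $parts['query'] : '');
    $hostHeader = $port === 80 ? $host : "{$host}:{$port}";

    // The fifth argument limits how long the connection attempt may take.
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($fp === false) {
        return null;
    }
    // Limit how long each read may block once connected.
    stream_set_timeout($fp, max(1, (int) ceil($timeout)));
    fwrite($fp, buildHeadRequest($hostHeader, $path));
    $line = fgets($fp, 1024);
    fclose($fp);

    return $line === false ? null : rtrim($line);
}
```

A call such as headStatusLine('http://example.com/') would then return the status line for the caller to inspect, with no page body transferred.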

trillianjedi




msg:3386085
 1:35 pm on Jul 4, 2007 (gmt 0)

Yup, that would be cleaner, Coop, although you should still set a socket timeout if you need to treat the server as down once it has had 7 seconds to respond.

turbohost




msg:3386400
 9:17 pm on Jul 4, 2007 (gmt 0)

OK, upgraded to the latest version of PHP and used TJ's code. It works fine but now I have another problem. When the site is not working I get an error message like below. Is there a way I can turn this off?

Warning: file_get_contents() [function.file-get-contents]: php_network_getaddresses: getaddrinfo failed: Temporary failure in name resolution in /home/#*$!/public_html/backlinks2.php on line 13

Warning: file_get_contents(http://example.com) [function.file-get-contents]: failed to open stream: Permission denied in /home/#*$!/public_html/backlinks2.php on line 13

turbohost




msg:3386408
 9:35 pm on Jul 4, 2007 (gmt 0)

OK, found it myself. I added "error_reporting(1);" at the beginning of the script.
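Note that error_reporting(1) limits reporting to fatal errors (E_ERROR) for the whole script, so unrelated warnings disappear as well. One narrower alternative is to intercept warnings only around the one call. This is a minimal sketch, assuming PHP 7.1 or later; the function name quietFetch is hypothetical.

```php
<?php
// Hypothetical helper: fetches up to $maxLen bytes and captures any warning
// instead of printing it. Returns [content or false, warning text or null].
function quietFetch(string $url, int $maxLen = 20): array
{
    $warning = null;

    // Intercept warnings only while this one call runs. If several are
    // raised, the most recent message is the one that is kept.
    set_error_handler(function (int $errno, string $errstr) use (&$warning): bool {
        $warning = $errstr;
        return true; // handled: PHP's own handler will not print it
    }, E_WARNING);

    try {
        $content = file_get_contents($url, false, null, 0, $maxLen);
    } finally {
        restore_error_handler(); // put the previous handler back
    }

    return [$content, $warning];
}
```

The caller can then decide whether to log the captured message or simply echo "Site down".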

Turbo

turbohost




msg:3386417
 9:54 pm on Jul 4, 2007 (gmt 0)

I'm testing the script with a list of sites, and it seems to be slow with inactive sites. Even with the timeout set to 1, it sometimes takes more than 10 seconds before the "Site down" echo is generated. The timeout refers to the time between 'open the site' and 'determine if the site is down', correct?

Turbo

turbohost




msg:3387161
 6:55 pm on Jul 5, 2007 (gmt 0)

Someone? Coopster? TJ?

trillianjedi




msg:3387278
 9:33 pm on Jul 5, 2007 (gmt 0)

The timeout refers to the time between 'open the site' and 'determine if the site is down', correct?

No. That function works at the TCP/IP socket layer, so will be based on a timeout further down the stack than HTTP.

Sounds to me like it's hanging waiting for an ACK. TCP/IP sockets block by default, so your app is forced to sit and wait for a response (or a timeout from the OS) before it can do anything. The timeout will come from the OS, and TCP/IP is designed to tolerate a connection being "down" for a period of time. So the PHP timeout may not fire until the underlying OS has timed out, and it sounds to me like that is taking around 9 seconds.

It might be a good idea to go multi-threaded, but that would mean coding rather than scripting (I think - I've never written anything multi-threaded in PHP so don't even know if it's possible).
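The waiting can also be overlapped without threads. As a swapped-in alternative to threading, the sketch below uses the multi interface of the cURL extension to run several HEAD-style checks at once. It assumes the cURL extension is installed and PHP 7.1 or later; the function name checkMany is hypothetical.

```php
<?php
// Hypothetical helper: checks several URLs concurrently and returns an array
// of URL => HTTP status code (0 when no HTTP response was received).
function checkMany(array $urls, int $timeout = 7): array
{
    $urls = array_unique($urls);
    if ($urls === []) {
        return [];
    }

    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true,     // headers only, no page body
            CURLOPT_RETURNTRANSFER => true,     // do not print anything
            CURLOPT_FOLLOWLOCATION => true,     // report the final status
            CURLOPT_CONNECTTIMEOUT => $timeout, // seconds allowed to connect
            CURLOPT_TIMEOUT        => $timeout, // seconds allowed in total
        ]);
        curl_multi_add_handle($mh, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($mh, $running);
        if ($running > 0) {
            curl_multi_select($mh, 1.0); // wait for activity instead of spinning
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $url => $ch) {
        $results[$url] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($mh, $ch);
    }
    return $results;
}
```

Because the transfers run side by side, a batch should take roughly as long as its slowest site rather than the sum of all the waits.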

coopster




msg:3387993
 6:57 pm on Jul 6, 2007 (gmt 0)

This is where the optional timeout argument to fsockopen() [php.net] comes in handy. That timeout only applies while connecting the socket; if you need a timeout for reading or writing data over the socket, use stream_set_timeout().
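The read side of that can be sketched as follows: stream_set_timeout() is applied to the open stream, and stream_get_meta_data() reports whether a read gave up. This is only an illustration, assuming PHP 7.1 or later; the function name readLineWithTimeout is hypothetical, and example.com in the commented usage is a placeholder host.

```php
<?php
// Hypothetical helper: reads one line from an already connected socket
// stream, giving up after $seconds. Returns the line without its line
// ending, or null on a timeout or a closed stream.
function readLineWithTimeout($stream, int $seconds): ?string
{
    // Applies to reads and writes on the open stream, not to connecting.
    stream_set_timeout($stream, $seconds);

    $line = fgets($stream, 1024);
    $meta = stream_get_meta_data($stream);

    if ($line === false || $meta['timed_out']) {
        return null;
    }
    return rtrim($line, "\r\n");
}

// Typical use:
// $fp = @fsockopen('example.com', 80, $errno, $errstr, 2.0); // 2 s to connect
// if ($fp !== false) {
//     fwrite($fp, "HEAD / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n");
//     $statusLine = readLineWithTimeout($fp, 5);              // 5 s to answer
//     fclose($fp);
// }
```

Keeping the two limits separate makes it possible to tell "could not connect" apart from "connected but never answered".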

trillianjedi




msg:3388004
 7:07 pm on Jul 6, 2007 (gmt 0)

The Coopster to the rescue!

That should do the trick for you turbo.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved