
Forum Moderators: coopster & jatar k


Function to check if a site is down or not

     
7:12 pm on July 3, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


Hi,

I am looking for a function to check whether a remote URL is down (giving a 404) or running (giving a 200), but I can't find one. Does anyone know of such a function? I think I'm searching with the wrong keywords in Google :-/

Turbo

7:15 pm on July 3, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


I forgot to mention this in my previous post: if the site is down, the function has to recognize that within a few seconds. Let's say the function cannot wait more than 7 seconds on a site that is down before moving on to the next site.

7:29 pm on July 3, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7242
votes: 0


Something like this should do the trick. Check the PHP manual for "file_get_contents" for more info. This is off the top of my head, but should work for the gist of it.

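<?php

// Returns true if the first bytes of $url can be fetched within the timeout.
function checkUrl($url) {
    ini_set('default_socket_timeout', 7);
    // Read at most 20 bytes; there is no need to download the whole page.
    $a = file_get_contents($url, false, null, 0, 20);
    // $http_response_header is an array that the HTTP wrapper fills in.
    return ($a != "") && !empty($http_response_header);
}

?>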

For the record, "default_socket_timeout" is specified in seconds, not milliseconds.

TJ

[edited by: trillianjedi at 7:32 pm (utc) on July 3, 2007]

7:32 pm on July 3, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


Thanks TJ

7:35 pm on July 3, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7242
votes: 0


Oh, one word of warning, as I think of it, that function (if it works as written) will give 7 seconds to load the first 20 bytes of a page (no point downloading the whole thing).

If a page is empty, it will give a false negative (a page shorter than 20 bytes is still returned in full)...

7:37 pm on July 3, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


I'm getting this error

Warning: file_get_contents() expects at most 2 parameters, 5 given in /home/#*$!x/public_html/backlinks.php on line 27

7:38 pm on July 3, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Feb 21, 2003
posts: 2355
votes: 0


If you have the CURL library installed, you can try this method as well:

[jellyandcustard.com...]
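For reference, a cURL-based check might look roughly like the sketch below. This is not the code from the linked page. The names isUrlUp and $timeoutSeconds are made up for illustration, and it assumes the cURL extension is loaded.

```php
<?php
// Hypothetical helper: true if $url answers with HTTP 200 within the limit.
function isUrlUp(string $url, int $timeoutSeconds = 7): bool
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,            // HEAD request, skip the body
        CURLOPT_RETURNTRANSFER => true,            // do not echo the response
        CURLOPT_FOLLOWLOCATION => true,            // report the status after redirects
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // limit for connecting
        CURLOPT_TIMEOUT        => $timeoutSeconds, // limit for the whole request
    ]);
    $ok = curl_exec($ch) !== false;
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return $ok && $status === 200;
}
```

Calling isUrlUp('http://example.com/') returns true or false. Some servers answer HEAD differently from GET, so treat a false result as a hint rather than proof that the site is down.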

7:41 pm on July 3, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7242
votes: 0


Are you on an old version of PHP? I believe what I wrote is correct for PHP5, but it's off the top of my head. Look that function up in the manual.

If you're on PHP4, just use:-

$a = file_get_contents($url);

The downside is it will download the whole page, which is a waste of bandwidth if you're just checking if something is alive or dead.

You also need to check the manual for $http_response_header. It is an array of response header lines rather than a string, so comparing it to an empty string will not tell you about a 404. You'll need to read the status line from it and check for a 404 yourself.
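As a rough sketch of that header check: the function below pulls the status code out of an array shaped like $http_response_header. The name finalStatusCode is made up for illustration. It takes the last status line it finds, on the assumption that earlier ones belong to redirects.

```php
<?php
// Hypothetical helper: returns the status code of the last HTTP status line
// found in an array shaped like $http_response_header, or null if none.
function finalStatusCode(array $headerLines): ?int
{
    $code = null;
    foreach ($headerLines as $line) {
        // Status lines look like "HTTP/1.1 404 Not Found".
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m) === 1) {
            $code = (int) $m[1]; // a later status line (after a redirect) wins
        }
    }
    return $code;
}
```

After a call to file_get_contents($url) you would pass $http_response_header to it. Note that with default settings file_get_contents() returns false on a 404, so the body alone already tells you the request failed; the headers tell you why.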

Added:-

Looking at your snippet I guess you're building a backlink checker, in which case you want to download the whole page anyway, use a RegEx to find your domain inside <a href> tags, and make sure there isn't a nofollow or anything like that on the link.
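A minimal sketch of that backlink check, using PHP's DOM parser in place of a regex (a deliberate swap, since a parser copes better with messy markup). The name hasFollowedLink is made up, and it only looks at rel="nofollow" on the link itself, not at robots meta tags or anything else.

```php
<?php
// Hypothetical helper: true if $html contains a link to $domain (or one of
// its subdomains) that is not marked rel="nofollow".
function hasFollowedLink(string $html, string $domain): bool
{
    if ($html === '') {
        return false; // loadHTML() rejects empty input
    }
    $doc = new DOMDocument();
    // @ hides the warnings libxml emits for sloppy real-world HTML.
    if (!@$doc->loadHTML($html)) {
        return false;
    }
    foreach ($doc->getElementsByTagName('a') as $a) {
        $host = parse_url($a->getAttribute('href'), PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // relative link or unparsable href
        }
        $host = strtolower($host);
        $matches = $host === $domain
            || substr($host, -strlen($domain) - 1) === '.' . $domain;
        $rel = strtolower($a->getAttribute('rel'));
        if ($matches && strpos($rel, 'nofollow') === false) {
            return true;
        }
    }
    return false;
}
```

You would feed it the page body from file_get_contents() together with your own domain in lower case, for example hasFollowedLink($page, 'example.com').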

7:53 pm on July 3, 2007 (gmt 0)

Administrator

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:July 31, 2003
posts:12541
votes: 1


And if you don't need the content from the page, just open a socket and send a HEAD as described in Checking if page exists [webmasterworld.com].

1:35 pm on July 4, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7242
votes: 0


Yup, that would be cleaner, Coop, although you should still set the socket timeout if you need to know whether the server has failed to respond within 7 seconds.

9:17 pm on July 4, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


OK, upgraded to the latest version of PHP and used TJ's code. It works fine but now I have another problem. When the site is not working I get an error message like below. Is there a way I can turn this off?

Warning: file_get_contents() [function.file-get-contents]: php_network_getaddresses: getaddrinfo failed: Temporary failure in name resolution in /home/#*$!/public_html/backlinks2.php on line 13

Warning: file_get_contents(http://example.com) [function.file-get-contents]: failed to open stream: Permission denied in /home/#*$!/public_html/backlinks2.php on line 13

9:35 pm on July 4, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


OK, found it myself. I added "error_reporting(1);" at the beginning of the script.

Turbo
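A note on why that works: 1 is the value of E_ERROR, so error_reporting(1) hides warnings and notices for the whole script. If you would rather silence only the fetch itself, something like the sketch below might do. The name quietFetch is made up, and it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: fetches $url without printing warnings and returns
// [body or false, warning message or null].
function quietFetch(string $url): array
{
    error_clear_last();
    // The @ operator silences diagnostics for this one expression only.
    $body = @file_get_contents($url);
    // The suppressed warning is still recorded and can be read back.
    $error = $body === false ? error_get_last() : null;
    return [$body, $error === null ? null : $error['message']];
}
```

With this, the rest of the script keeps its normal error reporting, and you can log the message for sites that fail instead of throwing it away.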

9:54 pm on July 4, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


I'm testing the script with a list of sites, and it seems to be slow with inactive sites. Even if I set the timeout to 1, it sometimes takes more than 10 seconds before the echo "Site down" is generated. The timeout refers to the time between 'open the site' and 'determine if the site is down', correct?

Turbo

6:55 pm on July 5, 2007 (gmt 0)

Full Member

10+ Year Member

joined:Aug 29, 2003
posts:244
votes: 0


Someone? Coopster? TJ?

9:33 pm on July 5, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7242
votes: 0


The timeout refers to the time between 'open the site' and 'determine if the site is down', correct?

No. That setting works at the TCP/IP socket layer, so the timeout applies further down the stack than HTTP.

Sounds to me like it's hanging while waiting for an ACK. TCP/IP calls are blocking in nature, so your app is forced to sit and wait for a response (or a timeout from the OS) before it can do anything else. By design, TCP/IP tolerates a connection being "down" for a period of time, so the PHP timeout may not fire until the underlying OS has timed out, and it sounds to me like that is taking around 9 seconds.

It might be a good idea to go multi-threaded, but that would mean coding rather than scripting (I think - I've never written anything multi-threaded in PHP so I don't even know if it's possible).
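For what it's worth, PHP can overlap the waiting without threads by using the curl_multi functions, so one slow site does not hold up the rest. The sketch below is one way that could look, not a threaded solution. The name checkUrlsConcurrently is made up, it assumes the cURL extension is loaded, and it expects each URL to appear only once in the list.

```php
<?php
// Hypothetical helper: checks several URLs at once and returns
// [url => true if the final status was 200, false otherwise].
function checkUrlsConcurrently(array $urls, int $timeoutSeconds = 7): array
{
    $multi = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true,            // HEAD request only
            CURLOPT_RETURNTRANSFER => true,            // do not echo anything
            CURLOPT_FOLLOWLOCATION => true,            // follow redirects
            CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // limit for connecting
            CURLOPT_TIMEOUT        => $timeoutSeconds, // limit per request
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(100000); // avoid a busy loop if select cannot wait
        }
    } while ($running > 0 && $status === CURLM_OK);

    $results = [];
    foreach ($handles as $url => $ch) {
        $results[$url] = curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
        curl_multi_remove_handle($multi, $ch);
    }
    return $results;
}
```

The total run time should then be close to the slowest single check rather than the sum of all of them.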

6:57 pm on July 6, 2007 (gmt 0)

Administrator

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:July 31, 2003
posts:12541
votes: 1


This is where the optional timeout argument to fsockopen() [php.net] comes in handy. Note that the timeout parameter to fsockopen() only applies while connecting the socket; if you need to set a timeout for reading or writing data over the socket, use stream_set_timeout().
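Putting the two timeouts together with a HEAD request, a sketch could look like the one below. The names headStatus and parseStatusLine are made up for illustration. It speaks plain HTTP on port 80 only; HTTPS would need the "ssl://" prefix and port 443. As far as I know, a slow DNS lookup can still add delay outside these limits.

```php
<?php
// Hypothetical helper: sends a HEAD request over a raw socket and returns the
// HTTP status code, or null if connecting or reading failed or timed out.
function headStatus(string $host, string $path = '/', float $connectTimeout = 7.0, int $readTimeout = 7): ?int
{
    // The fifth argument limits only the connection attempt.
    $fp = @fsockopen($host, 80, $errno, $errstr, $connectTimeout);
    if ($fp === false) {
        return null;
    }
    // Reads and writes on the open socket need their own limit.
    stream_set_timeout($fp, $readTimeout);
    fwrite($fp, "HEAD $path HTTP/1.1\r\nHost: $host\r\nConnection: close\r\n\r\n");
    $statusLine = fgets($fp, 1024);
    $meta = stream_get_meta_data($fp);
    fclose($fp);
    if ($statusLine === false || $meta['timed_out']) {
        return null;
    }
    return parseStatusLine($statusLine);
}

// Extracts the numeric code from a line such as "HTTP/1.1 200 OK".
function parseStatusLine(string $line): ?int
{
    return preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m) === 1 ? (int) $m[1] : null;
}
```

A call such as headStatus('example.com') gives back a code like 200 or 404, and null can be treated as "site down" for the purposes of the list check.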

7:07 pm on July 6, 2007 (gmt 0)

Senior Member

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:Apr 15, 2003
posts:7242
votes: 0


The Coopster to the rescue!

That should do the trick for you turbo.