
Forum Moderators: coopster & jatar k


Function to check if a site is down or not

     

turbohost

7:12 pm on Jul 3, 2007 (gmt 0)

10+ Year Member



Hi,

I am looking for a function to check whether a remote URL is down (for example, returning a 404) or running (returning a 200), but I can't find one. Does anyone know of such a function? I think I'm searching Google with the wrong keywords :-/

Turbo

turbohost

7:15 pm on Jul 3, 2007 (gmt 0)

10+ Year Member



I forgot to mention this in my previous post: if the site is down, the function has to recognize this within a few seconds. Let's say the function cannot wait more than 7 seconds on a site that is down before moving on to the next one.

trillianjedi

7:29 pm on Jul 3, 2007 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Something like this should do the trick. Check the PHP manual for "file_get_contents" for more info. This is off the top of my head, but should work for the gist of it.

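<?php

// Returns TRUE if $url delivers a non-empty response within about 7 seconds.
function checkUrl($url) {
    // default_socket_timeout is given in seconds.
    ini_set('default_socket_timeout', 7);
    // Fetch at most the first 20 bytes (offset and maxlen need PHP 5.1+).
    $a = file_get_contents($url, FALSE, NULL, 0, 20);
    // On failure $a is FALSE and $http_response_header is not set.
    return ($a !== FALSE) && ($a !== "") && !empty($http_response_header);
}

?>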

Note that "default_socket_timeout" is specified in seconds, not milliseconds.

TJ

[edited by: trillianjedi at 7:32 pm (utc) on July 3, 2007]

turbohost

7:32 pm on Jul 3, 2007 (gmt 0)

10+ Year Member



Thanks TJ

trillianjedi

7:35 pm on Jul 3, 2007 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Oh, one word of warning, now that I think of it: that function (if it works as written) allows 7 seconds to load the first 20 bytes of a page (no point downloading the whole thing).

A page shorter than 20 bytes is still returned in full, since 20 is only a maximum, but if the page body is completely empty it will give a false negative...

turbohost

7:37 pm on Jul 3, 2007 (gmt 0)

10+ Year Member



I'm getting this error

Warning: file_get_contents() expects at most 2 parameters, 5 given in /home/#*$!x/public_html/backlinks.php on line 27

bcolflesh

7:38 pm on Jul 3, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you have the CURL library installed, you can try this method as well:

[jellyandcustard.com...]
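For anyone who cannot reach the linked page, here is a minimal sketch of what a cURL-based check might look like. It is not the code from that link. The helper names curlStatus() and isUp() are made up for illustration, the 7-second values simply mirror the budget mentioned earlier in the thread, and the cURL extension is assumed to be loaded.

```php
<?php
// Sketch only: returns the HTTP status code, or 0 if no response arrived.
// curlStatus() is a hypothetical helper name, not taken from the linked page.
function curlStatus($url, $timeoutSeconds = 7) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);          // send a HEAD request, skip the body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // do not echo anything
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds); // limit for the connect phase
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);        // limit for the whole request
    curl_exec($ch);
    // CURLINFO_HTTP_CODE stays 0 when no HTTP response was received.
    // The handle is released when $ch goes out of scope.
    return (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
}

// Assumption: any 2xx or 3xx answer counts as "the server is alive".
function isUp($statusCode) {
    return $statusCode >= 200 && $statusCode < 400;
}
```

Typical use would be something like isUp(curlStatus($url)), moving on to the next site whenever it returns false.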

trillianjedi

7:41 pm on Jul 3, 2007 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Are you on an old version of PHP? The offset and maxlen arguments I used were only added in PHP 5.1, so older versions will reject the call. It's off the top of my head, so look that function up in the manual.

If you're on PHP4, just use:-

$a = file_get_contents($url);

The downside is it will download the whole page, which is a waste of bandwidth if you're just checking if something is alive or dead.

You also need to check the manual for $http_response_header. It is not a string: PHP fills it with an array of the response header lines, so to detect a 404 you'll need to look at the status line (the first element) rather than test for an empty string.
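As a rough sketch of that header check, the helper below pulls the status code out of an array shaped like $http_response_header. The name statusFromHeaders() is hypothetical, and taking the last status line when redirects are present is an assumption about what a checker would want.

```php
<?php
// Sketch only: extract the final HTTP status code from an array of header
// lines. statusFromHeaders() is a hypothetical helper name.
function statusFromHeaders($headers) {
    $code = 0;
    foreach ((array) $headers as $line) {
        // Status lines look like "HTTP/1.1 404 Not Found". After redirects
        // there is one per hop, so the last match wins.
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1];
        }
    }
    return $code; // 0 means no status line was found
}
```

It would be called as statusFromHeaders($http_response_header) right after file_get_contents(), in the same function, because PHP creates that variable in the local scope.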

Added:-

Looking at your snippet, I guess you're building a backlink checker, in which case you'll want to download the whole page anyway, RegEx for your domain inside <a href> tags, and make sure there isn't a nofollow or anything like that on the link.
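A minimal sketch of that backlink test follows. Note that it swaps the regular expression for PHP's DOMDocument parser, which tends to cope better with messy markup. The function name followedBacklinks() is hypothetical, and treating subdomains of the target as a match is an assumption.

```php
<?php
// Sketch only: list hrefs that point at $domain and are not marked nofollow.
// followedBacklinks() is a hypothetical helper name.
function followedBacklinks($html, $domain) {
    $found = array();
    if ($html === '') {
        return $found; // nothing to parse
    }
    $domain = strtolower($domain);
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so @ hides the parser warnings.
    @$doc->loadHTML($html);
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        $host = parse_url($href, PHP_URL_HOST);
        if (!is_string($host)) {
            continue; // relative or malformed link
        }
        $host = strtolower($host);
        // Match the domain itself or any subdomain of it.
        $matches = ($host === $domain)
            || (substr($host, -strlen($domain) - 1) === '.' . $domain);
        $nofollow = preg_match('/\bnofollow\b/i', $a->getAttribute('rel')) === 1;
        if ($matches && !$nofollow) {
            $found[] = $href;
        }
    }
    return $found;
}
```

An empty result would mean the page either does not link to the domain or only links to it with rel="nofollow".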

coopster

7:53 pm on Jul 3, 2007 (gmt 0)

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



And if you don't need the content from the page, just open a socket and send a HEAD as described in Checking if page exists [webmasterworld.com].
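The linked thread is not reproduced here, so the following is only a sketch of the general idea: open a socket, send a HEAD request, and read the status line. The names buildHeadRequest() and headStatus() are hypothetical, and plain HTTP on port 80 is assumed (HTTPS would need a different transport and port).

```php
<?php
// Sketch only. buildHeadRequest() and headStatus() are hypothetical names.

// Build a minimal HTTP/1.1 HEAD request for the given host and path.
function buildHeadRequest($host, $path = '/') {
    return "HEAD $path HTTP/1.1\r\n"
         . "Host: $host\r\n"
         . "Connection: close\r\n\r\n";
}

// Returns the status code of a HEAD request, or 0 if the connection failed
// or no status line came back.
function headStatus($host, $path = '/', $port = 80, $timeout = 7) {
    // @ hides the warning raised when the host cannot be reached.
    $fp = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if (!$fp) {
        return 0;
    }
    fwrite($fp, buildHeadRequest($host, $path));
    $statusLine = fgets($fp, 1024); // e.g. "HTTP/1.1 200 OK"
    fclose($fp);
    if ($statusLine !== false && preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return 0;
}
```

Because a HEAD response carries no body, only a few hundred bytes cross the wire per site.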

trillianjedi

1:35 pm on Jul 4, 2007 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Yup, that would be cleaner, Coop, although you should still set a socket timeout if you need to treat a server as down once it has had 7 seconds to respond.

turbohost

9:17 pm on Jul 4, 2007 (gmt 0)

10+ Year Member



OK, upgraded to the latest version of PHP and used TJ's code. It works fine but now I have another problem. When the site is not working I get an error message like below. Is there a way I can turn this off?

Warning: file_get_contents() [function.file-get-contents]: php_network_getaddresses: getaddrinfo failed: Temporary failure in name resolution in /home/#*$!/public_html/backlinks2.php on line 13

Warning: file_get_contents(http://example.com) [function.file-get-contents]: failed to open stream: Permission denied in /home/#*$!/public_html/backlinks2.php on line 13

turbohost

9:35 pm on Jul 4, 2007 (gmt 0)

10+ Year Member



OK, found it myself. I added "error_reporting(1);" at the beginning of the script.
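Worth noting that error_reporting(1) limits reporting to fatal errors (E_ERROR) for the whole script, so unrelated warnings disappear too. A narrower alternative, sketched below, is to silence only the one call with the @ operator and inspect the result. The function name quietFetch() and the shape of the returned array are assumptions made for illustration.

```php
<?php
// Sketch only: silence warnings for a single call instead of the whole script.
// quietFetch() is a hypothetical helper name.
function quietFetch($url, $maxBytes = 20) {
    // @ suppresses warnings from this expression only. Failure is still
    // detectable because file_get_contents() returns false.
    $data = @file_get_contents($url, false, null, 0, $maxBytes);
    if ($data === false) {
        // error_get_last() still records the suppressed warning.
        $err = error_get_last();
        return array('ok' => false, 'error' => $err ? $err['message'] : 'unknown error');
    }
    return array('ok' => true, 'data' => $data);
}
```

The caller can then log the 'error' text or simply echo "Site down" when 'ok' is false.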

Turbo

turbohost

9:54 pm on Jul 4, 2007 (gmt 0)

10+ Year Member



I'm testing the script with a list of sites, and it seems slow with inactive ones. Even with the timeout set to 1, it sometimes takes more than 10 seconds before "Site down" is echoed. The timeout covers the time between 'open the site' and 'determine if the site is down', correct?

Turbo

turbohost

6:55 pm on Jul 5, 2007 (gmt 0)

10+ Year Member



Someone? Coopster? TJ?

trillianjedi

9:33 pm on Jul 5, 2007 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The timeout covers the time between 'open the site' and 'determine if the site is down', correct?

No. That function works at the TCP/IP socket layer, so it depends on a timeout further down the stack than HTTP.

Sounds to me like it's hanging waiting for an ACK. Socket calls block by default, so your app is forced to sit and wait for a response (or a timeout from the OS) before it can do anything. The timeout will come from the OS, but TCP/IP is designed to tolerate a host being "down" for a period of time. So the PHP timeout may not fire until the underlying OS has timed out, which sounds like it takes around 9 seconds in your case.

It might be a good idea to go multi-threaded, but that would mean coding rather than scripting (I think; I've never written anything multi-threaded in PHP, so I don't even know if it's possible).
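PHP scripts do not need threads to overlap the waiting. One option, sketched here, is the curl_multi family, which runs several transfers concurrently in one process. This is concurrent I/O rather than true multi-threading. The function name checkMany() is hypothetical, and the cURL extension is assumed to be available.

```php
<?php
// Sketch only: probe several URLs concurrently with curl_multi.
// checkMany() is a hypothetical helper name. Returns array(url => status),
// where 0 means no HTTP response arrived within the timeout.
function checkMany($urls, $timeoutSeconds = 7) {
    $multi = curl_multi_init();
    $handles = array();
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request only
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // do not echo output
        curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeoutSeconds);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds);
        curl_multi_add_handle($multi, $ch);
        $handles[$url] = $ch;
    }
    // Drive all transfers until none are still running.
    $running = 0;
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0 && curl_multi_select($multi, 1.0) === -1) {
            usleep(100000); // select could not wait, so avoid a busy loop
        }
    } while ($running > 0
        && ($status === CURLM_OK || $status === CURLM_CALL_MULTI_PERFORM));

    $results = array();
    foreach ($handles as $url => $ch) {
        $results[$url] = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);
    return $results;
}
```

With this approach, a batch of dead sites should cost roughly one timeout period in total rather than one per site.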

coopster

6:57 pm on Jul 6, 2007 (gmt 0)

WebmasterWorld Administrator coopster is a WebmasterWorld Top Contributor of All Time 10+ Year Member



This is where the optional timeout argument to fsockopen() [php.net] comes in handy. That timeout only applies while connecting the socket; if you need a timeout for reading or writing data over the socket, use stream_set_timeout().
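To make the two timeouts concrete, here is a small sketch that applies one limit to the connection attempt and another to the read, then reports which phase failed. The function name probe() and its return strings are hypothetical, and plain HTTP is assumed.

```php
<?php
// Sketch only. probe() is a hypothetical name; it reports which phase failed.
function probe($host, $port = 80, $connectTimeout = 7, $readTimeout = 7) {
    // The fifth argument limits only the connection attempt.
    $fp = @fsockopen($host, $port, $errno, $errstr, $connectTimeout);
    if (!$fp) {
        return 'connect failed';
    }
    // From here on, reads and writes are limited by stream_set_timeout().
    stream_set_timeout($fp, $readTimeout);
    fwrite($fp, "HEAD / HTTP/1.0\r\nHost: $host\r\n\r\n");
    $line = fgets($fp, 1024);
    // The timed_out flag tells a read timeout apart from a closed connection.
    $meta = stream_get_meta_data($fp);
    fclose($fp);
    if ($meta['timed_out']) {
        return 'read timed out';
    }
    return $line === false ? 'no response' : trim($line);
}
```

On success the function returns the raw status line, such as "HTTP/1.1 200 OK". Name resolution may happen before either timeout starts counting, so a slow DNS lookup could still add delay on top of these limits.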

trillianjedi

7:07 pm on Jul 6, 2007 (gmt 0)

WebmasterWorld Senior Member trillianjedi is a WebmasterWorld Top Contributor of All Time 10+ Year Member



The Coopster to the rescue!

That should do the trick for you turbo.

 
