Forum Moderators: coopster


Opening other webpages and reading each line into an array

For some sites I can't open the page. Is there a better function?

ryan_b83

2:23 am on Jan 6, 2007 (gmt 0)

10+ Year Member



Hello, here is a snippet of code. I am using it to check for links to my site from a list of sites in a database. Some addresses fail when it tries to open the file. Are they blocking access from another site, or am I doing something wrong?

Thanks,

$sql = "SELECT link FROM table";
$result = mysql_query($sql) or die();
while($row = mysql_fetch_assoc($result)){
    unset($lines);

    // Read the remote page into an array, one element per line
    @$lines = file($row['link']);

    if($lines){
        $link = false;
        foreach ($lines as $line_num => $line) {
            // Look for the link in single- or double-quoted form
            if(strstr($line,"href='http://www.mysite.com'") || strstr($line,"href=\"http://www.mysite.com\"")){
                echo "<span style='color:green;font-weight:bold'>LINK EXISTS</span>";
                $link = true;
                break;
            }
        }
        if(!$link){
            echo "<span style='color:red;font-weight:bold'>NO LINK</span>";
        }
        echo " ==> ".$row['link']."<br>";
    }
    else{
        echo "Could not open ".$row['link']."<br>";
    }
}

eelixduppy

2:42 am on Jan 6, 2007 (gmt 0)



There is a slightly better way to do this, IMO:

$sql = "SELECT link FROM table";
$result = mysql_query($sql) or die();
$pattern = "/href=['\"]http:\/\/(www\.)?example\.com['\"]/i";
while($row = mysql_fetch_assoc($result)){
    if($file = file_get_contents($row['link'])) {
        echo '<span style="color:green;font-weight:bold">';
        echo (preg_match($pattern,$file))?'LINK EXISTS':'NO LINK';
        echo '</span>';
    }
    echo " ==> ".$row['link'].'<br/>';
}

You might have to check up on my pattern syntax. My brain isn't working well right now ;)
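
One way to check the pattern is to run it against a few hand-written strings before pointing it at real pages. The sketch below is only an illustration: the helper name hasBacklink() is made up, it assumes PHP 7 or later, and the pattern is a slightly looser variant of the one above (it also accepts https, spaces around the equals sign, and a trailing slash), which is an assumption about what should count as a link.

```php
<?php
// Hypothetical helper: returns true when $html contains an href pointing at $host.
function hasBacklink(string $html, string $host): bool
{
    // preg_quote() escapes the dots in the host name. '#' is used as the
    // delimiter so the slashes in "http://" need no escaping.
    $pattern = '#href\s*=\s*["\']https?://(www\.)?' . preg_quote($host, '#') . '/?["\']#i';
    return preg_match($pattern, $html) === 1;
}

var_dump(hasBacklink('<a href="http://www.example.com">x</a>', 'example.com')); // bool(true)
var_dump(hasBacklink("<a href='http://example.com/'>x</a>", 'example.com'));    // bool(true)
var_dump(hasBacklink('<a href="http://www.example.org">x</a>', 'example.com')); // bool(false)
```

A regular expression is still only a rough test: it will not see links whose attributes are unquoted or whose URL carries a path or query string.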

ryan_b83

2:49 am on Jan 6, 2007 (gmt 0)




Thanks, yeah, that's a better way to do it.

However, my main question is: why can I not open some of the pages? Could they be blocked?

eelixduppy

2:52 am on Jan 6, 2007 (gmt 0)



Well, it could be a couple of different reasons. Change the following line:

@$lines = file($row['link']);

To this:


$lines = file($row['link']); //notice I removed error suppression here

This will tell you exactly why you are not getting the file content. I would, however, try to implement a solution similar to mine as having a loop within a loop is messy :)
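
If you would rather keep the page free of raw warnings, the reason can also be captured and printed next to the failing address. This is a rough sketch, not a drop-in replacement: fetchWithReason() is a made-up name, it assumes PHP 7.1 or later (for error_clear_last() and the short list syntax), and the demo uses a missing local file so that it runs without network access. A URL from the database goes through the same code path.

```php
<?php
// Hypothetical helper: returns [contents, null] on success or [null, reason] on failure.
function fetchWithReason(string $url): array
{
    error_clear_last();                    // forget any earlier warning
    $contents = @file_get_contents($url);  // @ hides the warning, but PHP still records it
    if ($contents === false) {
        $error = error_get_last();
        return [null, $error['message'] ?? 'unknown error'];
    }
    return [$contents, null];
}

// Demo with a path that does not exist, so the failure branch is taken.
[$html, $reason] = fetchWithReason(__DIR__ . '/no-such-page.html');
echo $html === null
    ? "Could not open: $reason\n"
    : "Fetched " . strlen($html) . " bytes\n";
```

The recorded message is the same text the unsuppressed warning would have shown, so it should say whether the failure was a refused connection, a timeout, an HTTP error, or a disabled URL wrapper.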

Sorry I didn't answer your question off the bat. Let me know what you get from this if you need any more help.

Good luck!

ryan_b83

3:07 am on Jan 6, 2007 (gmt 0)




It says "failed to open http stream"

eelixduppy

3:10 am on Jan 6, 2007 (gmt 0)



I know this sounds like a stupid question, but do all of the URLs exist? Also does "the world" have access to these pages?

ryan_b83

3:12 am on Jan 6, 2007 (gmt 0)




Yes, here is a list of pages that didn't work. I suppressed the errors and just printed "could not open".

[edited by: jatar_k at 3:46 am (utc) on Jan. 6, 2007]
[edit reason] no links thanks [/edit]

eelixduppy

3:54 am on Jan 6, 2007 (gmt 0)



I can confirm that those links can be accessed by PHP, assuming your php.ini allows this. Check the allow_url_fopen setting. Other than this I cannot tell you why it's not working for you. Maybe you have an error in a part of your script I cannot see? I don't know.

You should try to implement my solution and then debug from there.

Trying something simple in its own PHP file could also tell you:


$file = file("http://www.example.com"); //make it static just for testing
echo '<pre>';
print_r($file);
echo '</pre>';
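
The allow_url_fopen setting can also be read from inside a script, which saves hunting for the right php.ini on a shared host. A minimal sketch, assuming PHP 7 or later; the helper name urlFopenEnabled() is made up for this example.

```php
<?php
// Hypothetical helper: true when php.ini lets file() and file_get_contents() open URLs.
function urlFopenEnabled(): bool
{
    // ini_get() returns the setting as a string such as "1", "0", "On" or "",
    // so normalise it to a real boolean.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

echo urlFopenEnabled()
    ? "allow_url_fopen is on\n"
    : "allow_url_fopen is off, so file('http://...') cannot work\n";
```

If the setting were off, every URL would be expected to fail rather than only some of them, so this check mainly rules out one cause.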

alfaguru

5:21 pm on Jan 6, 2007 (gmt 0)

10+ Year Member



Could it be that your hosting provider's firewall is blocking the connections? My provider requires notification of any external sites to be accessed from their servers, to guard against scrapers and bots being used on their service.
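
A quick way to test for that is to try a plain TCP connection from the server to one of the failing sites. The sketch below is an assumption-laden illustration: endpointFromUrl() and canConnect() are made-up names, the address is a placeholder, and it assumes PHP 7.1 or later. The connection attempt itself is commented out so the script can be read and run without touching the network.

```php
<?php
// Hypothetical helper: pull the host and port to test out of a URL.
function endpointFromUrl(string $url): array
{
    $parts  = parse_url($url);
    $host   = $parts['host'] ?? '';
    $scheme = strtolower($parts['scheme'] ?? 'http');
    $port   = $parts['port'] ?? ($scheme === 'https' ? 443 : 80);
    return [$host, $port];
}

// Hypothetical helper: try a plain TCP connection with a short timeout.
function canConnect(string $host, int $port, float $timeout = 5.0): bool
{
    $socket = @fsockopen($host, $port, $errno, $errstr, $timeout);
    if ($socket === false) {
        echo "Connection to $host:$port failed: $errstr ($errno)\n";
        return false;
    }
    fclose($socket);
    return true;
}

[$host, $port] = endpointFromUrl('http://www.example.com/links.html');
echo "Would test $host on port $port\n"; // prints "Would test www.example.com on port 80"
// var_dump(canConnect($host, $port)); // uncomment on the server that runs the link checker
```

If the same address connects from a home machine but times out from the hosting account, a firewall on the host's side becomes a likely suspect, though a block on the remote site's side would look similar.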