Grabbing content

Forum Moderators: coopster

easy question

WhosAWhata

4:38 am on Dec 3, 2003 (gmt 0)

This is a really basic question that I'm sure has been asked countless times in this forum. Someone I know tagged part of his web page with tags similar to

html

How would I use PHP to grab only this section of HTML into another file? I'm sure this has been asked before, so a link to the thread would be greatly appreciated.

i tried [webmasterworld.com] but it didn't work

mogwai

8:19 pm on Dec 3, 2003 (gmt 0)

Hi,

Use something like:

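$handle = fopen("http://www.yourdomainname.biz/file.html", "r");
if ($handle === false) {
    exit("Could not open the page");
}

// Read the page in 8 KB chunks until nothing is left
$contents = "";
while (!feof($handle)) {
    $data = fread($handle, 8192);
    if ($data === false) {
        break;
    }
    $contents .= $data;
}
fclose($handle);

// ereg() was removed in PHP 7, so use preg_match(); the s modifier lets . match newlines.
// Put the start and end tags that mark your section around (.*) in the pattern.
if (preg_match("/(.*)/s", $contents, $out)) {
    echo $out[1];
} else {
    echo "No Match";
}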
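
A minimal sketch of the matching step on its own, assuming the section is wrapped in a pair of HTML comment markers. The marker strings `<!-- start -->` and `<!-- end -->` and the helper name `extract_between` are made up for illustration; substitute whatever tags the page really uses.

```php
<?php
// Hypothetical helper: returns the text between two literal markers,
// or null when either marker is missing.
function extract_between(string $html, string $start, string $end): ?string
{
    // preg_quote() escapes regex metacharacters in the markers.
    // The s modifier lets . match newlines, and .*? stops at the first end marker.
    $pattern = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    if (preg_match($pattern, $html, $out)) {
        return $out[1];
    }
    return null;
}

// Sample input using the assumed markers
$sample = "<p>skip</p><!-- start --><b>keep me</b>\n<!-- end --><p>skip</p>";
$section = extract_between($sample, '<!-- start -->', '<!-- end -->');
echo $section === null ? "No Match" : $section;
```

The non-greedy `.*?` matters if the page happens to contain the end marker more than once; a greedy `.*` would run on to the last one.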

WhosAWhata

11:40 pm on Dec 3, 2003 (gmt 0)

this works great for sites that are on my server, but i need to grab content off another site

note:
the site i'm grabbing content from only accepts urls like
[site.com...]
OR
[site.com...]

but with no file extensions

jetboy_70

12:20 am on Dec 4, 2003 (gmt 0)

Try a Google search for 'php curl'. I was advised at Pubcon that the Curl library is the business for this kind of activity.

WhosAWhata

12:32 am on Dec 4, 2003 (gmt 0)

i don't think i can add php libraries

WhosAWhata

2:30 am on Dec 4, 2003 (gmt 0)

i asked my server support and they said that my server had curl on it. how would i use it in this situation? i don't really understand the php.net explanation

brotherhood of LAN

2:59 am on Dec 4, 2003 (gmt 0)

whos, as per your sticky... this should work. Only prob I can see: if there's no <body> tag in the document, there won't be a $page[1] value.

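// Change these variables to suit
$pathtocurl = "curl";
$pageyouwanttograb = "http://www.yahoo.com";
$filetowriteto = "writetothisfile.txt";
// escapeshellarg() keeps odd characters in the URL away from the shell
exec("$pathtocurl " . escapeshellarg($pageyouwanttograb), $page);
if(!count($page))
{
    echo 'couldnt get page';
    exit;
}
// Split once on the opening <body> tag; [^>]* also matches a bare <body>
$page = preg_split("'<body[^>]*>'i", implode("\n", $page), 2);
if(!isset($page[1]))
{
    echo 'no <body> tag found';
    exit;
}
$fp = fopen($filetowriteto,"w");
fwrite($fp,$page[1]);
fclose($fp);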

It will output "couldnt get page" if you have the wrong path to curl or the URL you requested couldn't be reached.
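
If PHP on the server was built with the curl extension, the same fetch can be sketched without shelling out through exec(). Treat this as a rough sketch: `fetch_page`, `body_of` and the script name `grab.php` are hypothetical, and whether the extension is enabled on a given host is an assumption worth checking first (for example with function_exists('curl_init')).

```php
<?php
// Hypothetical helper: fetch a URL with the curl extension.
// Returns the response body, or null on failure.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}

// Hypothetical helper: everything after the opening <body> tag,
// or null when the document has none.
function body_of(string $html): ?string
{
    $parts = preg_split('/<body[^>]*>/i', $html, 2);
    return $parts[1] ?? null;
}

// Only runs when a URL is passed on the command line, e.g.
//   php grab.php http://www.yahoo.com
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $html = fetch_page($argv[1]);
    $page = $html === null ? null : body_of($html);
    if ($page === null) {
        echo 'couldnt get page';
    } else {
        file_put_contents('writetothisfile.txt', $page);
    }
}
```

Keeping the fetch and the <body> split in separate functions means the split can be tried on a saved copy of the page before pointing the script at the live site.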

WhosAWhata

4:42 am on Dec 4, 2003 (gmt 0)

thanks bro...works like a charm