Forum Moderators: coopster & jatar k

Grabbing content

easy question

     
4:38 am on Dec 3, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 12, 2003
posts:454
votes: 0


This is a really basic question that I'm sure has been asked countless times in this forum. Someone I know tagged part of his web page with tags similar to
<!-- begin content -->
html
<!-- end content -->
How would I use PHP to grab only this section of HTML into another file? I'm sure this has been asked before, so a link to the thread would be greatly appreciated.

I tried [webmasterworld.com ] but it didn't work.

8:19 pm on Dec 3, 2003 (gmt 0)

Junior Member

10+ Year Member

joined:Feb 26, 2003
posts:50
votes: 0


Hi,

Use something like:

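// Open the remote page (needs allow_url_fopen enabled for http:// URLs)
$handle = fopen("http://www.yourdomainname.biz/file.html", "r");
if (!$handle) {
    die("Could not open page");
}

// Read the whole page in 8 KB chunks
$contents = "";
while (!feof($handle)) {
    $contents .= fread($handle, 8192);
}
fclose($handle);

// ereg() no longer exists in current PHP, so use preg_match().
// (.*?) stops at the first end marker; the s modifier lets . match newlines.
if (preg_match("/<!-- begin content -->(.*?)<!-- end content -->/s", $contents, $out)) {
    echo $out[1];
} else {
    echo "No Match";
}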
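
For anyone who wants the extraction step kept separate from the download, here is a minimal sketch that finds the section with plain string functions instead of a regular expression. The helper name extract_between and the sample HTML are made up for illustration, and it assumes PHP 7.1 or later (for the nullable return type).

```php
<?php
// Hypothetical helper: returns the text between two markers,
// or null when either marker is missing.
function extract_between(string $html, string $begin, string $end): ?string
{
    $start = strpos($html, $begin);
    if ($start === false) {
        return null;
    }
    $start += strlen($begin);            // skip past the begin marker

    $stop = strpos($html, $end, $start); // first end marker after it
    if ($stop === false) {
        return null;
    }
    return substr($html, $start, $stop - $start);
}

// Sample page, invented for this example
$sample = "<p>header</p><!-- begin content -->\n<p>html</p>\n<!-- end content --><p>footer</p>";

$section = extract_between($sample, '<!-- begin content -->', '<!-- end content -->');
echo $section === null ? 'No Match' : trim($section), "\n";
```

Once the section is in a variable, file_put_contents() can write it to the other file.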

11:40 pm on Dec 3, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 12, 2003
posts:454
votes: 0


This works great for sites that are on my server, but I need to grab content off another site.

note:
the site i'm grabbing content from only accepts urls like
[site.com...]
OR
[site.com...]

but with no file extensions

12:20 am on Dec 4, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:July 26, 2002
posts:535
votes: 0


Try a Google search for 'php curl'. I was advised at Pubcon that the Curl library is the business for this kind of activity.
12:32 am on Dec 4, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 12, 2003
posts:454
votes: 0


I don't think I can add PHP libraries.
2:30 am on Dec 4, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 12, 2003
posts:454
votes: 0


I asked my server support and they said that my server has curl on it. How would I use it in this situation? I don't really understand the php.net explanation.
2:59 am on Dec 4, 2003 (gmt 0)

Moderator from GB 

brotherhood_of_lan (WebmasterWorld Administrator, Top Contributor of All Time, 10+ Year Member)

joined:Jan 30, 2002
posts:4842
votes: 1


whos, as per your sticky... this should work. The only problem I can see is that if there's no <body> tag in the document, there won't be a $page[1] value.

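// Change these variables to suit
$pathtocurl = "curl";
$pageyouwanttograb = "http://www.yahoo.com";
$filetowriteto = "writetothisfile.txt";

// Run the curl binary; each line of output becomes one element of $page.
// escapeshellarg() keeps characters such as & or ? in the URL away from the shell.
exec("$pathtocurl " . escapeshellarg($pageyouwanttograb), $page);
if (!count($page)) {
    echo 'couldnt get page';
    exit;
}

// Split once at the opening <body> tag; [^>]* also matches a bare <body>
$page = preg_split("'<body[^>]*>'ims", implode("\n", $page), 2);
if (!isset($page[1])) {
    echo 'no body tag found';
    exit;
}

// Write everything after the <body> tag to the output file
$fp = fopen($filetowriteto, "w");
fwrite($fp, $page[1]);
fclose($fp);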

It will output "couldnt get page" if you have the wrong path to curl or the URL you requested couldn't be reached.
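
If the cURL extension is compiled into PHP (worth checking with function_exists('curl_init')), the same fetch can be done without shelling out to the curl binary. This is only a sketch under that assumption: fetch_page and after_body_tag are hypothetical helper names, the 10 second timeout is an arbitrary choice, and it assumes PHP 7.1 or later.

```php
<?php
// Sketch: fetch a URL with PHP's cURL extension.
// Returns the response body, or null when the request fails.
function fetch_page(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    if ($ch === false) {
        return null;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);     // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);     // follow HTTP redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, $timeoutSeconds); // give up after this many seconds

    $body = curl_exec($ch);  // string on success, false on failure
    return $body === false ? null : $body;
}

// Hypothetical helper: everything after the opening <body> tag,
// or null when the document has no <body> tag.
function after_body_tag(string $html): ?string
{
    $parts = preg_split('/<body[^>]*>/i', $html, 2);
    return isset($parts[1]) ? $parts[1] : null;
}

// Offline demonstration on an invented string
echo after_body_tag('<html><body class="x">hi</body></html>'), "\n";

// Typical use (URL and file name taken from the snippet above):
// $html = fetch_page('http://www.yahoo.com');
// if ($html !== null && ($body = after_body_tag($html)) !== null) {
//     file_put_contents('writetothisfile.txt', $body);
// }
```

The demonstration line prints hi</body></html>, since the split keeps everything that follows the opening tag.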

4:42 am on Dec 4, 2003 (gmt 0)

Preferred Member

10+ Year Member

joined:Nov 12, 2003
posts:454
votes: 0


thanks bro...works like a charm