Screen Scraping Help

Pulling info from another site using PHP.

         

DappaDan

2:42 pm on Aug 13, 2004 (gmt 0)

10+ Year Member



Hello,

I need to find out how I can screen scrape the top ten virus threats from a website and store them in a txt or XML file (preferably XML), so that from there I can display them wherever I like on my site. I know people like NAI.com offer something you can put on your site, but you have to have their logo and their layout, which I don't want.

I know you can do it with ASP, but I hate ASP, so I would like to know how I can do it with PHP, bearing in mind I'm not exactly hot on it.

Any help would be great.

Cheers

Dappa.

WhosAWhata

3:25 pm on Aug 13, 2004 (gmt 0)

10+ Year Member



Look at cURL and preg_match.

mattx17

3:49 pm on Aug 13, 2004 (gmt 0)

10+ Year Member



Here's a quick and dirty example of how I do it:

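$URL = "http://www.somesite.com/page.html";

// Fetch the whole page as one string (requires allow_url_fopen to be enabled).
$Contents = file_get_contents($URL);
if ($Contents === false) {
    die("Could not fetch $URL\n");
}

// Split into lines, accepting both \n and \r\n line endings.
$Lines = preg_split("/\r?\n/", $Contents);

foreach ($Lines as $Line)
{
    if (preg_match("/alert (.*?)$/", $Line, $Match)) // EXAMPLE ONLY!
    {
        print("Found match (" . $Match[1] . ")<br />\n");
    }
}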

Obviously you have to figure out what would match, and capture it.
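
To make the capture step a bit more concrete, here is a minimal sketch that pulls up to ten names out of a page and turns them into XML. The markup is made up for illustration (I'm assuming each threat sits in an `<li class="threat">` element, which the real page almost certainly does not use), and `extract_threats()` and `threats_to_xml()` are hypothetical helper names, so the pattern would need adjusting to whatever the real HTML looks like.

```php
<?php
// Hypothetical sample markup; the real page's HTML will differ.
$html = <<<HTML
<ul>
  <li class="threat">W32.Example.A</li>
  <li class="threat">W32.Example.B</li>
  <li class="threat">Trojan.Sample</li>
</ul>
HTML;

// Return up to $limit captured names, in the order they appear on the page.
function extract_threats(string $html, int $limit = 10): array
{
    // The pattern is an assumption tied to the sample markup above.
    preg_match_all('/<li class="threat">(.*?)<\/li>/s', $html, $matches);
    return array_slice(array_map('trim', $matches[1]), 0, $limit);
}

// Build a small XML document with one <threat> element per name.
function threats_to_xml(array $names): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;
    $root = $doc->appendChild($doc->createElement('threats'));
    foreach ($names as $name) {
        $item = $doc->createElement('threat');
        // createTextNode escapes characters such as & and < for us.
        $item->appendChild($doc->createTextNode($name));
        $root->appendChild($item);
    }
    return $doc->saveXML();
}

echo threats_to_xml(extract_threats($html));
```

Passing the returned string to file_put_contents() would save it to a file, so the scrape could run occasionally and the site could read the saved XML instead of hitting the other server on every page view.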

Other helpful links:

[php.net...]
[php.net...]

Hope this helps!

DappaDan

4:07 pm on Aug 13, 2004 (gmt 0)

10+ Year Member



Thanks mattx17,

So say I wanted to capture the ten newest viruses listed here:
[securityresponse.symantec.com...]

How would I do that? Do I not have to use that cURL stuff then?

Thanks for your help

WebNeeds

4:22 pm on Aug 13, 2004 (gmt 0)

10+ Year Member



Use cURL to get the file into a variable (do some searching on this site for cURL), then use preg_match to find exactly the piece you want.
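
A rough sketch of those two steps, assuming the cURL extension is installed. `fetch_page()` and `find_piece()` are hypothetical helper names, the URL in the comment is the placeholder used earlier in the thread, and the title pattern is only an example; the real pattern depends on the page being scraped.

```php
<?php
// Fetch a URL into a string with cURL; returns null if the request fails.
// fetch_page() is a hypothetical helper name, not a built-in.
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // hand the body back instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,   // stop waiting after 10 seconds
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Return the first capture group of $pattern in $html, or null if no match.
function find_piece(string $html, string $pattern): ?string
{
    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

// Only hit the network when a URL is passed on the command line:
//   php scrape.php http://www.somesite.com/page.html
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    $html = fetch_page($argv[1]);
    if ($html === null) {
        exit("Could not fetch {$argv[1]}\n");
    }
    // Example pattern only: grab the page title.
    echo find_piece($html, '/<title>(.*?)<\/title>/is') ?? 'no match', "\n";
}
```

Compared with file_get_contents(), cURL gives more control over timeouts and redirects, and it still works when allow_url_fopen is switched off.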