I'm not sure where to go from here to be able to get the link as well as the information.
Can anyone help me out?
Here is the code
<?php
include("connect.php");
$data = file_get_contents('http://example.com/new/export/?AF28_36/export/all_cappers');
$pat = '#<div id="service_promo">\s*<table cellpadding="%%CELLPADDING%%" cellspacing="%%CELLSPACING%%" border="0">\s*<tr class="promo_header">\s*<th>\s*<b><a [^>]*>([^<]*)</a></b><br>\s*</th>\s*</tr>\s*<tr>\s*<td class="description" valign="top">\s*([^<]*)\s*</td>#';
preg_match_all($pat,$data,$ma);
$out = array_combine($ma[1],$ma[2]);
foreach($out as $key => $val)
{
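// Escape both values so quotes in the scraped text cannot break the query.
$name = mysql_real_escape_string($key);
$content = mysql_real_escape_string($val);
$sql = "INSERT INTO handicappers(name, content) VALUES('$name', '$content')";
$result = mysql_query($sql);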
}
print_r($out);
?>
Thanks in advance
[edited by: eelixduppy at 12:23 am (utc) on Aug. 13, 2009]
[edit reason] exemplified [/edit]
It grabs everything EXCEPT the links.
Less than that, even. Looking at the tail end of your pattern, we can see the closing bracket of the opening <td> tag followed by zero or more whitespace characters ...
>\s*([^<]*)\s*</td>#
... and then the capturing subpattern, which matches only the text in this node, not elements. The less-than sign inside the negated character class tells your pattern to match zero or more characters that are not the start of a tag. Because the pattern then requires the closing </td> right afterwards, a cell containing any element fails to match. So it won't capture any HTML at all, not just anchor elements (links).
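Here is a minimal sketch of that behaviour. The two sample cells and the variable names are made up for illustration and are not taken from the real page; only the tail of the pattern comes from the code above.

```php
<?php
// Tail of the original pattern, anchored on the description cell.
$pat = '#<td class="description" valign="top">\s*([^<]*)\s*</td>#';

// Hypothetical sample cells (assumptions, not the real markup).
$plain  = '<td class="description" valign="top"> Free pick tonight </td>';
$linked = '<td class="description" valign="top"> Free pick <a href="/picks/1">here</a> </td>';

// A text-only cell matches and its text is captured.
$plainCount = preg_match($pat, $plain, $m1);

// A cell containing any element does not match at all:
// [^<]* stops at the "<" of the anchor, and "</td>" is not there.
$linkedCount = preg_match($pat, $linked, $m2);

var_dump($plainCount, isset($m1[1]) ? $m1[1] : null);
var_dump($linkedCount, isset($m2[1]) ? $m2[1] : null);
```

If the second cell is skipped here, the same thing is probably happening to every description on the real page that contains markup.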
You need to modify that portion of your regular expression.
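One possible modification, as a rough sketch only. It assumes the header link carries a double-quoted href attribute and that the description cell may contain inline markup. The sample $data string is invented, and the leading div/table part of the original pattern is left out to keep the example short.

```php
<?php
// Hypothetical sample markup modelled on the pattern in the question.
$data = '<tr class="promo_header"><th><b><a href="/capper/1">Capper One</a></b><br></th></tr>'
      . '<tr><td class="description" valign="top"> Free pick <a href="/picks/1">here</a> </td>';

// Two changes from the original pattern:
//  1. the header anchor now captures its href value;
//  2. ([^<]*) in the description becomes a lazy (.*?) so elements are allowed.
// The s modifier lets . match newlines inside the cell.
$pat = '#<tr class="promo_header">\s*<th>\s*<b><a [^>]*href="([^"]*)"[^>]*>([^<]*)</a></b><br>\s*</th>\s*</tr>'
     . '\s*<tr>\s*<td class="description" valign="top">\s*(.*?)\s*</td>#s';

// PREG_SET_ORDER groups the captures per match instead of per subpattern.
preg_match_all($pat, $data, $ma, PREG_SET_ORDER);

$rows = array();
foreach ($ma as $m) {
    $rows[] = array(
        'link'    => $m[1],  // URL of the header link
        'name'    => $m[2],  // link text
        'content' => $m[3],  // description, inner HTML included
    );
}
print_r($rows);
```

Keeping each match as its own row, rather than using array_combine(), leaves room for the link next to the name and avoids losing entries that share a name. If the page markup varies much, a DOM parser such as DOMDocument may turn out to be more reliable than a single regular expression.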