Forum Moderators: coopster

How to adjust filter settings

timmy01

5:03 am on Nov 11, 2010 (gmt 0)

10+ Year Member



Hi!
I am trying to filter a load of my old static HTML pages
back into the database.
The script I am using now can only filter one tag per record.

Suppose a page holds 10 records and every record has one <b></b> tag.
In the same pass I also want to capture everything inside the <i></i> tag.

Obviously I have to start by adding some extra variables:
$config['start_tagB'] = "<i>";
$config['end_tagB'] = "</i>";

But how do I implement this within the function?

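<?php

$config['url'] = "http://www.sample.com"; // my domain
$config['start_tag'] = "<b>"; // start
$config['end_tag'] = "</b>"; // end
$config['show_tags'] = false; // keep the tags in the output? (was used below but never set)

class grabber
{
    public $error = '';
    public $html = array();

    // Fetch $url and collect every fragment found between $start and $end.
    function grabhtml( $url, $start, $end )
    {
        $file = file_get_contents( $url );

        if( $file !== false )
        {
            // Quote the tags so they are matched literally inside the pattern.
            $pattern = '#' . preg_quote( $start, '#' ) . '(.*?)' . preg_quote( $end, '#' ) . '#s';

            if( preg_match_all( $pattern, $file, $match ) )
            {
                // $match[0] holds the full matches (tags included), $match[1] the contents.
                $this->html = $match;
            }
            else
            {
                $this->error = "Tags cannot be found.";
            }
        }
        else
        {
            $this->error = "Site cannot be found!";
        }
    }

    // Remove the start and end tags from $html unless $show is true.
    function strip( $html, $show, $start, $end )
    {
        if( !$show )
        {
            $html = str_replace( $start, "", $html );
            $html = str_replace( $end, "", $html );
        }

        return $html;
    }
}

$grab = new grabber;
$grab->grabhtml( $config['url'], $config['start_tag'], $config['end_tag'] );

if( $grab->error )
{
    echo $grab->error;
}
else
{
    // Only loop when something was matched, so html[0] is known to exist.
    foreach( $grab->html[0] as $html )
    {
        echo htmlspecialchars( $grab->strip( $html, $config['show_tags'], $config['start_tag'], $config['end_tag'] ) ) . "<br>";
    }
}

?>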


Hope you can help out,
as it will save me lots of time.

coopster

8:15 pm on Nov 25, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



You could pass an array of the elements that you want stripped and use a regular expression to remove all possible values. Something along the lines of:
$pattern = '/<\/?(b|i)>/';

Of course, you would build the pattern tag values dynamically, but that should get your mind started in the right direction.
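
To make the suggestion concrete, here is a rough sketch of building that pattern from an array of tag names, and of collecting the contents of each tag in the same run. The function names build_strip_pattern() and grab_tags() and the sample $page string are made up for illustration and are not part of the original script. The sketch also assumes plain tags without attributes, such as <b> and <i>.

```php
<?php
// Hypothetical helpers; the thread itself only gives the '/<\/?(b|i)>/' pattern.

// Build a pattern that matches the opening or closing form of any listed tag,
// e.g. array('b', 'i') gives '/<\/?(b|i)>/'.
function build_strip_pattern( array $tags )
{
    $quoted = array();
    foreach( $tags as $tag )
    {
        $quoted[] = preg_quote( $tag, '/' );
    }

    return '/<\/?(' . implode( '|', $quoted ) . ')>/';
}

// Collect the contents of every listed tag, grouped by tag name.
// Assumes tags carry no attributes.
function grab_tags( $html, array $tags )
{
    $found = array();
    foreach( $tags as $tag )
    {
        $quoted = preg_quote( $tag, '#' );
        if( preg_match_all( "#<{$quoted}>(.*?)</{$quoted}>#s", $html, $match ) )
        {
            $found[$tag] = $match[1]; // captured contents, tags excluded
        }
        else
        {
            $found[$tag] = array();
        }
    }

    return $found;
}

// Sample input standing in for one of the static pages (an assumption).
$tags = array( 'b', 'i' );
$page = '<p><b>Title one</b> by <i>Author one</i></p>'
      . '<p><b>Title two</b> by <i>Author two</i></p>';

$pattern = build_strip_pattern( $tags );
$records = grab_tags( $page, $tags );
$plain   = preg_replace( $pattern, '', $page );

print_r( $records );
echo $plain, "\n";
```

With this approach, adding a third tag should only mean adding its name to $tags, instead of adding another pair of start/end variables to $config.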