I'm banging my head against a wall trying to parse an HTML page.
Any help would be greatly appreciated.
I need to write a PHP script that does the following:
Parse any HTML page line by line. Wherever it finds text, it should extract the text, store it in a separate variable (an array or similar), and replace it with a unique token.
Say my HTML page is something like this:
$page_content = "<html>
<title>
My Page
</title>
<body>
<div>
Hello!
</div>
<div>
Its a beautiful world
</div>
</body>
</html>";
It should give me two things:
first, the original HTML with the text replaced by tokens, and second, an array mapping each token to its string.
$new_page_content = "<html>
<title>
TOK_TITLE_1
</title>
<body>
<div>
TOK_DIV_1
</div>
<div>
TOK_DIV_2
</div>
</body>
</html>";
$token_strings_array = array(
'TOK_TITLE_1' => "My Page",
'TOK_DIV_1' => "Hello!",
'TOK_DIV_2' => "Its a beautiful world"
);
What would be the best way to do this?
Are there any standard libraries or classes that I could use?
I need help with this ASAP!
I haven't tested this, but it should get you going...
<?php
// Set up the array first
$token_strings_array = array(
'TOK_TITLE_1' => "My Page",
'TOK_DIV_1' => "Hello!",
'TOK_DIV_2' => "Its a beautiful world"
);
var_dump($token_strings_array); // dump 1
print_r($token_strings_array); // dump 2
$dump = "";
foreach ($token_strings_array as $key => $value): // dump 3: build an HTML list of the pairs
$dump .= "<li>$key=&gt;$value</li>";
endforeach;
// Use the contents of the array in the web page
$new_page_content = "<html>
<title>
{$token_strings_array['TOK_TITLE_1']}
</title>
<body>
<div>
{$token_strings_array['TOK_DIV_1']}
</div>
<div>
{$token_strings_array['TOK_DIV_2']}
</div>
<div>
<ul>
$dump
</ul>
</div>
</body>
</html>";
print $new_page_content;
?>
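Note that the snippet above goes in the opposite direction: it starts from a finished token array and fills in a template. For the extraction the question actually asks about, a minimal sketch using PHP's bundled DOM extension (DOMDocument and DOMXPath) might look like the following. The function name tokenize_html() and the TOK_<TAG>_<n> naming scheme are assumptions taken from the example in the question, not a library API. It walks text nodes rather than lines, since text in HTML doesn't reliably sit on a line of its own.

```php
<?php
// Sketch only: tokenize_html() is a hypothetical helper, not part of PHP.
// It replaces every non-blank text node with a token and records the mapping.
function tokenize_html(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml's warnings about loose markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $counters = []; // per-tag running count, e.g. ['DIV' => 2]
    $tokens = [];   // token => original text

    // Text nodes with visible content, skipping script and style bodies.
    $query = '//text()[normalize-space() and not(parent::script) and not(parent::style)]';
    foreach ($xpath->query($query) as $node) {
        $tag = strtoupper($node->parentNode->nodeName);
        $counters[$tag] = ($counters[$tag] ?? 0) + 1;
        $token = "TOK_{$tag}_{$counters[$tag]}";
        $tokens[$token] = trim($node->nodeValue);
        $node->nodeValue = "\n$token\n";
    }

    return [$doc->saveHTML(), $tokens];
}

// Usage with the page from the question.
$page_content = "<html>
<title>
My Page
</title>
<body>
<div>
Hello!
</div>
<div>
Its a beautiful world
</div>
</body>
</html>";

[$new_page_content, $tokens] = tokenize_html($page_content);
print_r($tokens);

// Putting the strings back is a plain substitution.
$restored = strtr($new_page_content, $tokens);
```

Be aware that DOMDocument normalizes the markup when it saves (it may add a doctype and a <head> element, for instance), so the result is not byte-for-byte the input with tokens swapped in. loadHTML() also assumes ISO-8859-1 unless the page declares its charset, so pages in other encodings need that handled first.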