regex help

unable to fetch out text from html

         

phparion

6:35 am on Nov 20, 2007 (gmt 0)


Hi

I have HTML output that is long and full of nested tables. The chunk of HTML I am interested in looks like this:

1 - It starts with a font tag:

<font color="#000000">

2 - Then there is a lot of HTML, including <br> tags, letters, digits, hyperlinks, + signs and other symbols.

3 - The part I am interested in ends with the word "rollnumber".

So basically I want to fetch all the code between

<font color="#000000">[anything here]rollnumber

For this I tried the regex below:

preg_match_all("/color=\"#000000\">[\/\(\)-:<>\w\s]+rollnumber/i",$cnt,$mat);

But the $mat returned by this regex is an empty array.

Can anybody help please?

thank you.
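
A side note on the character class in the pattern above: inside the brackets, `\)-:` is read as a range from `)` to `:`, so the class allows `+`, digits and `/` but not characters such as `"` or `=`, which hyperlinks normally contain. The sketch below only illustrates that behaviour; the sample strings are made up and are not from the original page.

```php
<?php
// The character class from the question, anchored so the whole sample must match.
$class = '/^[\/\(\)-:<>\w\s]+$/';

// Hypothetical sample fragments, chosen to show what the class accepts.
foreach (['a+b', '<br>', '12:30', 'href="x"', 'a=b'] as $sample) {
    $verdict = preg_match($class, $sample) ? 'allowed' : 'rejected';
    printf("%-10s %s\n", $sample, $verdict);
}
```

If the target HTML contains attributes or query strings, a class like this would stop short of "rollnumber" even when the right variable is passed in.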

adb64

7:14 am on Nov 20, 2007 (gmt 0)


Hi phparion,

Maybe the following regex will do the job for you:

preg_match_all("/color=\"#000000\">(.+)rollnumber/iU",$cnt,$mat);

You have to put parentheses around the part of the expression you want returned.
I also added the U modifier [php.net] to make it ungreedy.

Regards,
Arjan
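
One thing to watch with a pattern like this: by default `.` does not match a newline, so if the HTML between the font tag and "rollnumber" spans several lines, `(.+)` will not cross them. A minimal sketch, assuming the markup is multi-line, adds the `s` modifier and uses a lazy `.+?` in place of the `U` modifier. The `$html` sample string is invented for illustration.

```php
<?php
// Hypothetical sample input; the real page is much longer.
$html = "<td><font color=\"#000000\">Name: <a href=\"x.php?id=1\">A+B</a><br>\n"
      . "Marks: 42<br>\nrollnumber 123</font></td>";

// s: let . match newlines; i: case-insensitive; .+? is lazy, so it stops
// at the first "rollnumber" after the font tag.
$count = preg_match_all('/<font color="#000000">(.+?)rollnumber/is', $html, $mat);

if ($count === false) {
    echo "regex error\n";
} elseif ($count === 0) {
    echo "no match\n";
} else {
    // $mat[0] holds the full matches, $mat[1] the captured text.
    echo $mat[1][0], "\n";
}
```

Checking the return value separately for `false` and `0` helps tell a broken pattern apart from a subject that simply does not contain the text.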

ayushchd

11:15 am on Nov 20, 2007 (gmt 0)


This should give you everything between <font color="#000000"> and rollnumber:

$start_indicate = '<font color="#000000">';
$contents = substr($file_contents, strpos($file_contents, $start_indicate) + strlen($start_indicate));
$end_indicate = "rollnumber";
$contents = substr($contents, 0, strpos($contents, $end_indicate));

$file_contents must hold the ENTIRE output.

Ayush
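
Note that `strpos` returns `false` when a marker is not found, and the snippet above would then cut the string at the wrong offset without any warning. Here is a hedged sketch of the same idea with those checks added. The function name `extract_between` and the sample string are invented for illustration.

```php
<?php
// Hypothetical helper: returns the text between $start and the next $end,
// or null when either marker is missing.
function extract_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);

    // Look for the end marker only after the start marker.
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

// Made-up sample; in practice this would be the full page output.
$file_contents = '<td><font color="#000000">Name: A+B<br>rollnumber 123</font></td>';
$contents = extract_between($file_contents, '<font color="#000000">', 'rollnumber');
var_dump($contents);
```

Unlike the regex with the `i` modifier, `strpos` is case-sensitive; `stripos` is the case-insensitive variant if the markers may differ in case.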

phparion

4:44 am on Nov 21, 2007 (gmt 0)


Thank you very much for your replies. However, the solution turned out to be very simple: I was passing $cnt to the regex, while the target HTML output was actually in $rawHTML :D. Silly me...

anyway thanks again