Read HTML response using Perl regular expression

PankajBansal

9:29 am on Jun 16, 2009 (gmt 0)

I retrieve an HTML response. Below is an extract of the response I get:

<tr>
<td rowspan="1" valign="top">
</td>
<td rowspan="1" valign="top">2009-06-16 00:26:02
</td>
<td>
Status
</td>
<td>AAA
</td>
<td>BBB
</td>
</tr>

The response is too big, so I copied only an extract. I have to capture the date 2009-06-16 00:26:02 using a Perl regular expression, from the row where the status changes from AAA to BBB.
In the whole response, the status changes from AAA to BBB only once.

I tried:
$htmlResponse =~ m/^(.*)<td>AAA<\/td><td>BBB<\/td><\/tr>/i

but it did not match.

Please help

mattdw

12:52 pm on Jun 16, 2009 (gmt 0)

Since the HTML response spans multiple lines, your regex is probably failing because it doesn't account for the newline characters between the tags.

You could either match the newlines explicitly (\n) or add the 's' modifier alongside your 'i' modifier at the end; with 's', the 'match anything' dot (.) also matches newlines. Check out this tutorial, it explains this: [anaesthetist.com...]
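To make the effect of the 's' modifier concrete, here is a minimal sketch. The two-cell string is a hypothetical fragment modelled on the posted extract, not the real response, and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical fragment modelled on the posted extract: each cell ends with a newline.
my $html = "<td>AAA\n</td>\n<td>BBB\n</td>";

# Without /s, '.' does not match "\n", so '.*' cannot bridge the two cells.
my $without_s = ($html =~ m/<td>AAA.*<td>BBB/i) ? 1 : 0;

# With /s, '.' also matches "\n", so the same pattern succeeds.
my $with_s = ($html =~ m/<td>AAA.*<td>BBB/is) ? 1 : 0;

print "without /s: $without_s\n";   # prints "without /s: 0"
print "with /s: $with_s\n";         # prints "with /s: 1"
```

Note that the pattern from the first post contains no dot between the tags at all, so adding 's' alone would not be enough there; something has to match the line breaks between `AAA` and `</td>`.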

PankajBansal

1:08 pm on Jun 16, 2009 (gmt 0)

Hmm, I tried that, but it's still not working.

mattdw

2:46 pm on Jun 16, 2009 (gmt 0)

Got a sample of what you are trying now?

janharders

7:49 am on Jun 17, 2009 (gmt 0)

Maybe
$string =~ m!<td rowspan="1" valign="top">([^<]+)\s*</td>\s*<td>\s*Status\s*</td>\s*<td>AAA\s*</td>\s*<td>BBB\s*</td>!gis

will help. \s matches whitespace, so it covers line breaks, tabs and spaces (plus carriage returns and form feeds).
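As a rough check, the pattern can be run against the extract from the first post. This is only a sketch: the variable names `$html` and `$date` are made up here, the 'g' modifier is dropped because a single match is wanted, and the trim step is an addition, needed because `[^<]+` also swallows the newline after the timestamp.

```perl
use strict;
use warnings;

# The extract from the first post, used as sample input.
my $html = <<'HTML';
<tr>
<td rowspan="1" valign="top">
</td>
<td rowspan="1" valign="top">2009-06-16 00:26:02
</td>
<td>
Status
</td>
<td>AAA
</td>
<td>BBB
</td>
</tr>
HTML

my $date;
# '!' is used as the delimiter so the slashes in </td> need no escaping.
if ($html =~ m!<td rowspan="1" valign="top">([^<]+)\s*</td>\s*<td>\s*Status\s*</td>\s*<td>AAA\s*</td>\s*<td>BBB\s*</td>!is) {
    # $1 still contains the trailing newline, so strip surrounding whitespace.
    ($date = $1) =~ s/^\s+|\s+$//g;
}

print defined $date ? "$date\n" : "no match\n";   # prints "2009-06-16 00:26:02"
```

The empty first cell is skipped automatically: after it, the pattern expects a plain `<td>` followed by `Status`, but finds the second `<td rowspan=...>` instead, so the engine moves on to the cell that holds the date.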

PankajBansal

5:02 pm on Jun 17, 2009 (gmt 0)

Thanks, everyone.
The code below worked for me:

$htmlResponse=~m/(.*)\s*<\/td>\s*<td>\s*Status\s*<\/td>\s*<td>AAA\s*<\/td>\s*<td>\s*BBB\s*<\/td>/igs
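One thing to watch with this pattern: with the 's' modifier, the greedy `(.*)` starts at the beginning of the string, so `$1` holds everything up to the date, not just the date. The sketch below shows the difference on a hypothetical two-row response (the first row and its CCC status are invented for the illustration) and one possible way to capture only the timestamp, assuming it always has the `YYYY-MM-DD HH:MM:SS` shape.

```perl
use strict;
use warnings;

# Hypothetical response: an unrelated CCC -> AAA row, then the AAA -> BBB row.
my $html = <<'HTML';
<tr>
<td rowspan="1" valign="top">2009-06-15 23:10:45
</td>
<td>
Status
</td>
<td>CCC
</td>
<td>AAA
</td>
</tr>
<tr>
<td rowspan="1" valign="top">2009-06-16 00:26:02
</td>
<td>
Status
</td>
<td>AAA
</td>
<td>BBB
</td>
</tr>
HTML

# The part of the pattern that follows the captured text, shared by both matches.
my $tail = qr!\s*</td>\s*<td>\s*Status\s*</td>\s*<td>\s*AAA\s*</td>\s*<td>\s*BBB\s*</td>!i;

# Greedy (.*) with /s: the capture runs from the start of the response to the date.
my ($greedy) = $html =~ m/(.*)$tail/s;

# Capturing the timestamp by its shape keeps the capture to the date alone.
my ($date) = $html =~ m/(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d)$tail/s;

print "greedy capture length: ", length($greedy), "\n";
print "date: $date\n";   # prints "date: 2009-06-16 00:26:02"
```

If the greedy version is kept, the date can still be pulled out of `$1` afterwards, but matching the timestamp directly avoids that extra step.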

[edited by: PankajBansal at 5:03 pm (utc) on June 17, 2009]

chorny

7:09 pm on Jul 19, 2009 (gmt 0)

A good solution is to use HTML::TreeBuilder::XPath.
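A minimal sketch of that approach follows. It assumes the module is installed from CPAN (it is not part of core Perl) and that every row has the same layout as the posted extract: date in the second cell, old status in the fourth, new status in the fifth. The surrounding `<table>` tags are added here only to make the sample a complete fragment.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;   # CPAN module, not in core Perl

# The posted extract, wrapped in <table> for the sake of the example.
my $html = <<'HTML';
<table>
<tr>
<td rowspan="1" valign="top">
</td>
<td rowspan="1" valign="top">2009-06-16 00:26:02
</td>
<td>
Status
</td>
<td>AAA
</td>
<td>BBB
</td>
</tr>
</table>
HTML

my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse($html);
$tree->eof;

# Assumed layout: td[2] = date, td[4] = old status, td[5] = new status.
my ($row) = $tree->findnodes(
    q{//tr[normalize-space(td[4])='AAA' and normalize-space(td[5])='BBB']}
);

my $date;
if ($row) {
    # normalize-space() trims the newline that follows the timestamp.
    $date = $row->findvalue('normalize-space(td[2])');
}

print defined $date ? "$date\n" : "no match\n";
$tree->delete;   # free the parse tree
```

Compared with the regular expressions above, this does not depend on where the line breaks fall inside the cells, only on the cell positions.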