
Forum Moderators: coopster & jatar k & phranque


Read HTML response using Perl regular expression



9:29 am on Jun 16, 2009 (gmt 0)

5+ Year Member

I retrieve an HTML response. Below is an extract of the response I get:

<td rowspan="1" valign="top">
<td rowspan="1" valign="top">2009-06-16 00:26:02

The response is too big, so I copied only an extract of it. I have to capture the date 2009-06-16 00:26:02 using a Perl regular expression whenever the status changes from AAA to BBB.
In the whole response, the status changes from AAA to BBB only once.

I tried with
=~ m/^(.*)<td>AAA<\/td><td>BBB<\/td><\/tr>/i

but to no avail.

Please help


12:52 pm on Jun 16, 2009 (gmt 0)

5+ Year Member

Since the HTML response spans multiple lines, your regex is probably failing because it does not account for the newline characters.

You could either match the newlines explicitly (\n), or add the 's' modifier alongside your 'i' modifier so that the 'match anything' dot (.) also matches newlines. This tutorial explains it: [anaesthetist.com...]
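
To illustrate the point about newlines, here is a minimal sketch. The two-line `$html` string is a made-up stand-in for the real response (which was not posted in full), and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical two-line fragment: the cells are separated by a newline.
my $html = "<td>AAA</td>\n<td>BBB</td>";

# Without /s, the dot does not match "\n", so .* cannot cross the line break.
my $without_s = ($html =~ m/AAA.*BBB/i) ? 1 : 0;

# With /s, the dot matches any character, including "\n".
my $with_s = ($html =~ m/AAA.*BBB/is) ? 1 : 0;

# Alternative: match the whitespace between the tags explicitly with \s*.
my $with_ws = ($html =~ m{AAA</td>\s*<td>BBB}i) ? 1 : 0;

print "without /s: $without_s\n";   # prints "without /s: 0"
print "with /s:    $with_s\n";      # prints "with /s:    1"
print "with \\s*:   $with_ws\n";    # prints "with \s*:   1"
```

Matching the whitespace explicitly is usually the tighter option, because `.*` with /s is greedy and can run across far more of the document than intended.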


1:08 pm on Jun 16, 2009 (gmt 0)

5+ Year Member

Hmm, I tried, but it's not working.


2:46 pm on Jun 16, 2009 (gmt 0)

5+ Year Member

Got a sample of what you are trying now?


7:49 am on Jun 17, 2009 (gmt 0)

WebmasterWorld Senior Member 5+ Year Member

$string =~ m!<td rowspan="1" valign="top">([^<]+)\s*</td>\s*<td>\s*Status\s*</td>\s*<td>AAA\s*</td>\s*<td>BBB\s*</td>!gis

will help. \s matches any whitespace character, so it covers line breaks, tabs, and spaces (as well as carriage returns and form feeds).
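
For anyone who wants to try this, here is a rough, self-contained sketch of that pattern in use. The table in `$response` is invented for the example: the original post only showed a fragment, so the row layout (date cell, "Status" cell, old value, new value) is an assumption taken from the regex itself. The `m` prefix is needed because `!` is used as the delimiter, and /g is left out since only one match is expected.

```perl
use strict;
use warnings;

# Assumed layout; the real response may differ.
my $response = <<'HTML';
<table>
<tr>
<td rowspan="1" valign="top">2009-06-15 23:10:45
</td>
<td>Status</td>
<td>BBB</td>
<td>AAA</td>
</tr>
<tr>
<td rowspan="1" valign="top">2009-06-16 00:26:02
</td>
<td>Status</td>
<td>AAA</td>
<td>BBB</td>
</tr>
</table>
HTML

my $date;
if ($response =~ m!<td rowspan="1" valign="top">([^<]+)</td>\s*<td>\s*Status\s*</td>\s*<td>AAA\s*</td>\s*<td>BBB\s*</td>!is) {
    # [^<]+ also swallows the trailing newline, so trim the capture.
    ($date = $1) =~ s/^\s+|\s+$//g;
}

print defined $date ? "Changed at: $date\n" : "No AAA -> BBB change found\n";
```

Because `[^<]+` cannot cross a `<`, the capture stays inside a single cell, and only the row where AAA is directly followed by BBB matches.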


5:02 pm on Jun 17, 2009 (gmt 0)

5+ Year Member

Thanks, everyone.
The code below worked for me:


[edited by: PankajBansal at 5:03 pm (utc) on June 17, 2009]


7:09 pm on Jul 19, 2009 (gmt 0)

5+ Year Member

A good solution is to use HTML::TreeBuilder::XPath.
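
A minimal sketch of how that might look, assuming the CPAN module HTML::TreeBuilder::XPath is installed and that each row holds the date in its first cell, followed by the old and new status values. The sample `$html` and the XPath expression are assumptions for illustration, not taken from the real response.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Assumed row layout: date, "Status", old value, new value.
my $html = <<'HTML';
<table>
<tr><td rowspan="1" valign="top">2009-06-15 23:10:45
</td><td>Status</td><td>BBB</td><td>AAA</td></tr>
<tr><td rowspan="1" valign="top">2009-06-16 00:26:02
</td><td>Status</td><td>AAA</td><td>BBB</td></tr>
</table>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Select the first cell of the row in which an "AAA" cell is
# immediately followed by a "BBB" cell.
my $xpath = q{//tr[td[normalize-space(.)='AAA']}
          . q{/following-sibling::td[1][normalize-space(.)='BBB']]/td[1]};

my $date = "" . $tree->findvalue($xpath);
$date =~ s/^\s+|\s+$//g;   # trim whitespace left around the cell text

$tree->delete;             # free the parse tree

print "Changed at: $date\n";
```

Compared with a regex, this approach does not depend on attribute order or on the exact whitespace between tags, at the cost of parsing the whole document.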
