
Forum Moderators: coopster & jatar k & phranque


Read HTML response using Perl regular expression

     
9:29 am on Jun 16, 2009 (gmt 0)

New User

5+ Year Member

joined:June 7, 2009
posts:13
votes: 0


I retrieve an HTML response. Below is an extract of the response I get:

<tr>
<td rowspan="1" valign="top">
</td>
<td rowspan="1" valign="top">2009-06-16 00:26:02
</td>
<td>
Status
</td>
<td>AAA
</td>
<td>BBB
</td>
</tr>

The response is too big, so I have copied only an extract. I have to capture the date 2009-06-16 00:26:02 using a Perl regular expression whenever the status changes from AAA to BBB.
In the whole response, the status changes from AAA to BBB only once.

I tried

$htmlResponse =~ m/^(.*)<td>AAA<\/td><td>BBB<\/td><\/tr>/i

but to no avail.

Please help

12:52 pm on June 16, 2009 (gmt 0)

New User

5+ Year Member

joined:June 11, 2009
posts:27
votes: 0


Since the HTML response spans multiple lines, your regex is probably failing because it does not account for the newline characters.

You could either match the newlines explicitly (\n), or add the 's' modifier next to your 'i' modifier at the end, so that the 'match anything' dot (.) also matches a newline. Check out this tutorial, it explains this: [anaesthetist.com...]
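A minimal sketch of the 's' modifier's effect, in case it helps. The sample string and the variable names $html, $without_s and $with_s are made up for illustration and are not taken from the real response:

```perl
use strict;
use warnings;

# Hypothetical two-cell sample standing in for the real (much larger) response.
my $html = "<td>AAA\n</td>\n<td>BBB\n</td>";

# Without /s, '.' does not match a newline, so the pattern cannot span lines.
my $without_s = ($html =~ m/<td>AAA.*<td>BBB/i)  ? 1 : 0;

# With /s, '.' also matches a newline, so the same pattern can span lines.
my $with_s    = ($html =~ m/<td>AAA.*<td>BBB/is) ? 1 : 0;

print "without /s: $without_s, with /s: $with_s\n";
```

Note that the pattern in the original attempt also expects </td> to follow AAA and BBB directly, while the response has a line break in between, so the 's' modifier alone may not be enough there.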

1:08 pm on June 16, 2009 (gmt 0)

New User

5+ Year Member

joined:June 7, 2009
posts:13
votes: 0


Hmm, I tried, but it's not working.
2:46 pm on June 16, 2009 (gmt 0)

New User

5+ Year Member

joined:June 11, 2009
posts:27
votes: 0


Got a sample of what you are trying now?
7:49 am on June 17, 2009 (gmt 0)

Senior Member

WebmasterWorld Senior Member 5+ Year Member

joined:May 31, 2008
posts:661
votes: 0


maybe
$string =~ m!<td rowspan="1" valign="top">([^<]+)\s*</td>\s*<td>\s*Status\s*</td>\s*<td>AAA\s*</td>\s*<td>BBB\s*</td>!gis

will help. \s matches any whitespace character, so it covers line breaks, tabs and spaces (and a few others, such as carriage return and form feed).

5:02 pm on June 17, 2009 (gmt 0)

New User

5+ Year Member

joined:June 7, 2009
posts:13
votes: 0


Thanks, everyone.
The code below worked for me:

$htmlResponse=~m/(.*)\s*<\/td>\s*<td>\s*Status\s*<\/td>\s*<td>AAA\s*<\/td>\s*<td>\s*BBB\s*<\/td>/igs
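
A note for later readers: with the 's' modifier, the greedy (.*) in that pattern appears to capture everything that precedes the Status cell, not only the date. Below is a rough sketch of a tighter variant that captures just the timestamp. It assumes the timestamp always has the form YYYY-MM-DD HH:MM:SS, and it uses the extract from the first post as sample data; the $date variable is introduced here for illustration.

```perl
use strict;
use warnings;

# Sample data: the extract from the first post, standing in for the full response.
my $htmlResponse = <<'HTML';
<tr>
<td rowspan="1" valign="top">
</td>
<td rowspan="1" valign="top">2009-06-16 00:26:02
</td>
<td>
Status
</td>
<td>AAA
</td>
<td>BBB
</td>
</tr>
HTML

my $date;
if ($htmlResponse =~ m{
        (\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2})   # timestamp (capture group 1)
        \s*</td>\s*<td>\s*Status\s*</td>         # the Status cell
        \s*<td>\s*AAA\s*</td>                    # old status
        \s*<td>\s*BBB\s*</td>                    # new status
    }xi) {
    $date = $1;
    print "Status changed at: $date\n";
}
```

The 'x' modifier only allows the pattern to be spread over several commented lines. No 's' modifier is needed here because the pattern uses \s* instead of the dot to step over line breaks.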

[edited by: PankajBansal at 5:03 pm (utc) on June 17, 2009]

7:09 pm on July 19, 2009 (gmt 0)

Junior Member

5+ Year Member

joined:May 8, 2008
posts: 74
votes: 0


A good solution is to use HTML::TreeBuilder::XPath.