Parsing an HTML page with ASP code blocks

         

wardbekker

10:11 pm on Jan 27, 2003 (gmt 0)

10+ Year Member



I want to parse an HTML page that contains ASP code blocks (example: <FONT size=2><% =foo %></FONT>) and create a multi-dimensional array that looks like this:

(0)("<FONT size=2>")
(1)("<% =foo %>")
(2)("</FONT>")

Any idea how this can be done?

hakre

10:31 am on Jan 29, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi wardbekker,

what about doing a split() on the whole page with "<%" as the delimiter? then loop through the resulting array and split each entry on "%>". now you'll get an array of n entries. if n > 1, the first element of that array is the inside of an asp tag, so put the "<%" and "%>" back around it. all you've got to do now is append each entry of the current array to a new one. after the loop, this new array is what you wanted.

here is what i described, written as code. it is not speed optimized (see the redim inside the loop), but it does the job. check the syntax first, i did not run it.


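Dim t As String, v As Variant, w As Variant
Dim a1() As String, a2() As String, a3() As String
Dim c As Long
t = (string with the whole file content)
' every piece after the first one starts inside an asp block
a1 = Split(t, "<%")
For Each v In a1
    a2 = Split(CStr(v), "%>")
    ' more than one part means part 0 is asp code: restore its delimiters
    If UBound(a2) > 0 Then
        a2(0) = "<%" & a2(0) & "%>"
    End If
    For Each w In a2
        ' Preserve keeps the entries collected so far
        ReDim Preserve a3(0 To c)
        a3(c) = CStr(w)
        c = c + 1
    Next
Next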
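
Since the code above was posted untested, here is a small runnable sketch of the same two-pass split for checking the logic. It is written in Python rather than VB only so that it is easy to run; the function name `split_asp_blocks` is made up for this example, and unlike the VB version it drops the empty strings that appear when two delimiters sit next to each other.

```python
def split_asp_blocks(page):
    """Split a page into HTML and ASP pieces, keeping the ASP delimiters."""
    pieces = []
    for chunk in page.split("<%"):
        parts = chunk.split("%>")
        if len(parts) > 1:
            # The text before the first "%>" was inside an ASP block.
            parts[0] = "<%" + parts[0] + "%>"
        # Assumption: empty strings between adjacent delimiters are not wanted.
        pieces.extend(p for p in parts if p)
    return pieces


print(split_asp_blocks("<FONT size=2><% =foo %></FONT>"))
# ['<FONT size=2>', '<% =foo %>', '</FONT>']
```

Note that this only separates ASP blocks from the HTML around them. Two HTML tags in a row, such as "<b><i>", stay together in one entry.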

tomasz

3:45 pm on Feb 3, 2003 (gmt 0)

10+ Year Member



or ..

Dim sHTML As String
Dim sOut() As String

' put a marker after every ">" and split on it ("~~" must not occur in the page)
sHTML = Replace(sHTML, ">", ">~~")
sOut = Split(sHTML, "~~")

line1 = sOut(0)
line2 = sOut(1)
...
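
To see what the marker trick does and does not handle, here is a rough Python sketch of the same idea (Python is used only so it can be run quickly; `MARKER` and `split_after_tags` are names invented for this example). It assumes the marker never occurs in the page and refuses to run otherwise.

```python
MARKER = "~~"  # assumption: this string never occurs in the page itself


def split_after_tags(page):
    """Insert a marker after every '>' and split on the marker."""
    if MARKER in page:
        raise ValueError("marker already occurs in the page")
    marked = page.replace(">", ">" + MARKER)
    # The last '>' leaves an empty trailing piece, which is dropped here.
    return [piece for piece in marked.split(MARKER) if piece]


print(split_after_tags("<FONT size=2><% =foo %></FONT>"))
# ['<FONT size=2>', '<% =foo %>', '</FONT>']

# Limitation 1: text is not separated from the tag that follows it.
print(split_after_tags("<b>hi</b>"))
# ['<b>', 'hi</b>']

# Limitation 2: a '>' inside ASP code cuts the block in two.
print(split_after_tags("<% if a > b %>"))
# ['<% if a >', ' b %>']
```

So this works for the example in the question, but it needs extra care for pages with text between tags or with ">" inside the ASP code.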

dhirschfeld

5:22 pm on Feb 16, 2003 (gmt 0)

10+ Year Member



Have you considered using a regular expression? It will usually run much faster than a hand-written parsing routine, and in my opinion it is easier to write.

Below is an example of what a regular expression to match font tags would look like:

' IgnoreCase so that <FONT ...> from the question matches as well
Dim rex As Regex = New Regex("<font.*?>(?<FontStuff>.*?)</font>", RegexOptions.IgnoreCase)

Then you loop through the matches and read the group labeled FontStuff from each one.
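
For the array asked for in the original question, a regular expression can also be used as a tokenizer instead of matching one kind of tag. The following is only a sketch in Python (chosen so it can be run directly; the pattern is not the font pattern above, and `TOKEN` and `tokenize` are names made up here). It tries an ASP block first, then an ordinary tag, then a run of text, and finally a stray "<" so that nothing is lost.

```python
import re

# Alternatives are tried left to right at each position:
#   <%.*?%>   an ASP block (non-greedy, may span lines because of DOTALL)
#   <[^>]*>   an ordinary HTML tag
#   [^<]+     a run of text between tags
#   <         a stray "<" that opens nothing
TOKEN = re.compile(r"<%.*?%>|<[^>]*>|[^<]+|<", re.DOTALL)


def tokenize(page):
    """Return the page as a flat list of ASP blocks, tags and text runs."""
    return TOKEN.findall(page)


print(tokenize("<FONT size=2><% =foo %></FONT>"))
# ['<FONT size=2>', '<% =foo %>', '</FONT>']

print(tokenize("<b>hi <% if a > b %></b>"))
# ['<b>', 'hi ', '<% if a > b %>', '</b>']
```

Because the ASP alternative comes first, a ">" inside the ASP code does not end the block early. An ASP block nested inside an HTML attribute value is not handled by this sketch.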