ASP pattern matching

         

shakes911

9:58 am on Jul 18, 2001 (gmt 0)



Does anyone have a pattern matching script to strip an HTML link, i.e. <a href=.....>link text</a>, while retaining the link text?

thanks

sugarkane

7:59 am on Jul 19, 2001 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Shakes, do you want to extract the URL from the link or just the link text?

I've been trying to get some info on how ASP deals with pattern matching, but with little success.

I can only find examples of using the 'islike' function, where it returns true or false for a matched pattern - do you know if it supports substitution etc?

shakes911

8:43 am on Jul 19, 2001 (gmt 0)



I only wish to retain the actual link text and rid myself of the URL. I found this script on 4guysfromrolla.com, but it just doesn't seem to work for me no matter how I tweak it.

'Assume strHTML contains the HTML with the <a href="URL">URL Description</a>
'We want to store into strText the HTML in strHTML, but with the HREF tags
'changed to a more text-friendly URL Description [URL]

'First, create a reg exp object
Dim objRegExp
Set objRegExp = New RegExp

objRegExp.IgnoreCase = True
objRegExp.Global = True
objRegExp.Pattern = "<a\s+href=""http://(.*?)"">\s*((\n|.)+?)\s*</a>"

'Now, replace the HREF tags with our preferred format
strText = objRegExp.Replace(strHTML, "$2 [http://$1]")

shakes911

8:46 am on Jul 19, 2001 (gmt 0)



I think you can use the '%' sign as a regular expression to symbolise any text of zero or more characters, like the * symbol in DOS.
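
For what it's worth, '%' is the wildcard of the SQL LIKE operator; to the VBScript RegExp object it is just an ordinary character. The regular-expression counterpart of the DOS '*' is '.*' (any character except a newline, zero or more times). A quick sketch to check this, with made-up sample strings, written for the Windows Script Host (in an ASP page, use Response.Write instead of WScript.Echo):

```vbscript
Dim objRegExp
Set objRegExp = New RegExp

' "%" is a literal percent sign in a regular expression
objRegExp.Pattern = "^link%text$"
WScript.Echo objRegExp.Test("link to some text")   ' Test returns False
WScript.Echo objRegExp.Test("link%text")           ' Test returns True

' ".*" stands for zero or more characters (newlines excluded)
objRegExp.Pattern = "^link.*text$"
WScript.Echo objRegExp.Test("link to some text")   ' Test returns True
WScript.Echo objRegExp.Test("linktext")            ' Test returns True
```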

sugarkane

7:24 pm on Jul 19, 2001 (gmt 0)



A general regex would be <a\s+.*?>(.*?)</a\s*>

I don't have access to an ASP box, so I can't test this, but it looks like this should work:

objRegExp.Pattern = "<a\s+.*?>(.*?)</a\s*>"
strText = objRegExp.Replace(strHTML, "$1")
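
The two lines above can be wrapped up as a small function. This is only a sketch and has not been run on a live ASP server: the name StripLinks and the sample string are made up for illustration, and it assumes a scripting engine whose RegExp supports the non-greedy *? quantifier (VBScript 5.5, as far as I know). It also makes two small changes to the pattern: [^>]* keeps the first part inside the opening tag, and [\s\S] is used instead of . so that link text spanning several lines still matches, since . does not match a newline.

```vbscript
Option Explicit

' StripLinks is a hypothetical helper name, not part of ASP or VBScript.
' It replaces every <a ...>text</a> with just the link text.
Function StripLinks(strHTML)
    Dim objRegExp
    Set objRegExp = New RegExp

    objRegExp.IgnoreCase = True   ' match <A HREF=...> as well as <a href=...>
    objRegExp.Global = True       ' replace every link, not only the first

    ' <a\s[^>]*>   opening tag with its attributes
    ' ([\s\S]*?)   link text, as little as possible, newlines included
    ' </a\s*>      closing tag
    objRegExp.Pattern = "<a\s[^>]*>([\s\S]*?)</a\s*>"

    StripLinks = objRegExp.Replace(strHTML, "$1")
    Set objRegExp = Nothing
End Function

' Sample input, made up for this example
Dim strHTML
strHTML = "Visit <a href=""http://www.example.com/"">our site</a> today."

' In an ASP page, use Response.Write instead of WScript.Echo
WScript.Echo StripLinks(strHTML)   ' Visit our site today.
```

One caveat: [^>]* stops at the first > it sees, so an attribute value that contains a literal > would cut the match short. That is rare in ordinary links, but a regular expression is not a full HTML parser.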

shakes911

2:41 pm on Jul 20, 2001 (gmt 0)



thanks man.