
Perl Server Side CGI Scripting Forum

    
Extract urls from string
perl, url, extract
KeithBoynton




msg:445106
 10:22 am on May 30, 2004 (gmt 0)

Hello,

I'm hoping someone can help me here. I'm trying to extract a list of urls from a string.

Bear in mind that URLs could be typed as either "http://blahblah.com" or "www.blahblah.com".

Here is what I have so far.


my @links = $textString =~ m#((www\.|http://)[^\s<"']+)#gm;

It seems to work fine in most scenarios, except that I always get an extra "www." or "http://" element in the results because of the parentheses around the "or" condition.

Any help would be greatly appreciated.

 

SeanW




msg:445107
 8:48 pm on May 30, 2004 (gmt 0)

I suppose you could make the http optional, and also use \b to make it easier to find word boundaries:

m#\b((?:http://)?www\.[^\s<"']+)#mg

Sean
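For anyone hitting the same extra-element problem: in list context a /g match returns every capture group, so the inner group around the alternation shows up as its own element. A minimal sketch of one way around that, using a non-capturing group (?:...) on the original pattern. The sample text is made up for illustration; $textString is the name from the question.

```perl
use strict;
use warnings;

# Hypothetical sample text, not from the original post.
my $textString = 'See http://blahblah.com and www.blahblah.com for details, '
               . 'or "http://www.example.org/a?b=1" in quotes.';

# (?:...) groups the alternation without capturing it, so each match
# contributes exactly one element to @links: the whole URL.
my @links = $textString =~ m#((?:www\.|http://)[^\s<"']+)#g;

print "$_\n" for @links;
```

Note that the character class still swallows trailing punctuation such as a comma or full stop placed directly after a URL, so this is only a partial fix.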

mbauser2




msg:445108
 4:34 am on May 31, 2004 (gmt 0)

Writing a regular expression to find URLs is like finding the Holy Grail: Ain't gonna happen, because it doesn't exist.

It's time to learn about Perl modules: URI-Find-0.13 [search.cpan.org]
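A rough sketch of how the module is typically used, assuming URI::Find is installed from CPAN. The sample text and variable names are made up for illustration. The callback's return value replaces each match in the text, so returning the original text leaves the string unchanged.

```perl
use strict;
use warnings;
use URI::Find;

# Hypothetical sample text, not from the thread.
my $text = 'Visit http://blahblah.com/ for details.';
my @found;

# The callback receives a URI object and the text as it appeared
# in the string. Collect the latter and return it unmodified.
my $finder = URI::Find->new(sub {
    my ($uri, $orig_text) = @_;
    push @found, $orig_text;
    return $orig_text;
});

# find() takes a reference to the text and returns the number of URIs found.
my $count = $finder->find(\$text);

print "$_\n" for @found;
```

As far as I know, the same distribution also ships URI::Find::Schemeless for bare "www." style addresses, but check the documentation for the version you install.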

KeithBoynton




msg:445109
 10:56 am on May 31, 2004 (gmt 0)

Thanks very much for your help :)
