|Extract urls from string|
perl, url, extract
| 10:22 am on May 30, 2004 (gmt 0)|
I'm hoping someone can help me here. I'm trying to extract a list of URLs from a string, keeping in mind that a URL could be typed as "http://blahblah.com" or as "www.blahblah.com".
Here is what I have so far.
my @links = $textString =~ m#((www\.|http://)[^\s<"']+)#gm;
It seems to work fine in most scenarios, except that I always get an extra "www." or "http://" element in the list, because the parentheses around the "or" condition form a second capturing group.
Any help would be greatly appreciated.
| 8:48 pm on May 30, 2004 (gmt 0)|
I suppose you could make the http optional, and also use \b to make it easier to find word boundaries:
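A minimal sketch of that idea follows. The pattern and the helper name extract_urls are assumptions for illustration, not the exact code from this reply. It uses a non-capturing group, (?: ... ), so the alternation no longer adds an extra element to the list, makes the "http://" prefix optional when the host starts with "www.", and anchors each match at a word boundary with \b.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every URL-like substring found in $text.
sub extract_urls {
    my ($text) = @_;
    # (?: ... ) groups without capturing, so in list context each match
    # contributes only the single outer capture.
    # Accepts "http://..." (with or without "www.") or a bare "www....".
    return $text =~ m{\b((?:http://(?:www\.)?|www\.)[^\s<"']+)}g;
}

my $textString = 'Visit http://blahblah.com or www.blahblah.com today';
my @links = extract_urls($textString);
print "$_\n" for @links;
```

Note that the character class still swallows trailing punctuation such as a comma or full stop directly after a URL, so this remains a rough filter rather than a real URL parser.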
| 4:34 am on May 31, 2004 (gmt 0)|
Writing a regular expression to find URLs is like finding the Holy Grail: Ain't gonna happen, because it doesn't exist.
It's time to learn about Perl modules: URI-Find-0.13 [search.cpan.org]
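For anyone who wants to try the module, here is a short sketch of how it is typically used. It assumes URI::Find is installed from CPAN; the helper name find_urls and the sample text are made up for illustration. The module calls your callback once per URI it finds and replaces the URI in the text with whatever the callback returns, so returning the original text leaves the string unchanged.

```perl
use strict;
use warnings;
use URI::Find;    # not a core module; assumed to be installed from CPAN

# Hypothetical helper: collects the URIs that URI::Find reports in $text.
sub find_urls {
    my ($text) = @_;    # work on a copy; find() edits the string in place
    my @found;
    my $finder = URI::Find->new(sub {
        my ($uri, $orig_text) = @_;   # URI object, and the text as it appeared
        push @found, $orig_text;
        return $orig_text;            # put the original text back unchanged
    });
    $finder->find(\$text);            # takes a reference to the string
    return @found;
}

my @links = find_urls('Visit http://blahblah.com or www.blahblah.com today');
print "$_\n" for @links;
```

The module applies its own heuristics for things like surrounding punctuation, which is the part that is hard to get right with a hand-written pattern.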
| 10:56 am on May 31, 2004 (gmt 0)|
Thanks very much for your help :)