PHP Server Side Scripting Forum

    
regex for finding urls in user input
a working solution, to be used or improved
Skier88
msg:4207568
4:13 pm on Sep 27, 2010 (gmt 0)

I spent quite a while looking for a good regex to match URLs and found no shortage of strict ones, but nothing suitable for processing user input, which differs in two key ways. First, URLs written by users are not always in encoded form. Second, URLs may be preceded or followed by many types of punctuation, without a separating space. I thought I'd post my solution here in case it makes things easier for somebody else in my position. If you have any suggestions to improve the regex, please post them, but keep in mind that it is purposely not very strict.

I decided on the following parameters:
1) starts with protocol:// or www.
2) contains .[top-level-domain]
3) has at least 1 character between the previous two
4) preceded by anything
5) followed by space or EOF
6) does not include the last character if it is punctuation (one of . , : ;)

The regex:
%(([A-Za-z]{3,5})://|www\.)\S+?\.[A-Za-z]{2,4}.*?(?=[\.,:;]?(\s|$))%
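
For anyone who wants to try it, here is a minimal sketch of running the pattern through preg_match_all. The sample input and the variable names are made up for illustration, and the annotated copy is only one reading of how the numbered parameters map onto the pattern, written with PCRE's x (extended) modifier so the comments are ignored by the engine.

```php
<?php
// Pattern copied verbatim from the post; '%' is the regex delimiter.
$pattern = '%(([A-Za-z]{3,5})://|www\.)\S+?\.[A-Za-z]{2,4}.*?(?=[\.,:;]?(\s|$))%';

// The same pattern in extended (x) mode, annotated with the numbered parameters.
$annotated = '%
    (([A-Za-z]{3,5})://|www\.)   # 1) protocol:// or www.
    \S+?                         # 3) at least one character in between
    \.[A-Za-z]{2,4}              # 2) a dot and a 2-4 letter top-level domain
    .*?                          # rest of the URL, as short as possible
    (?=[\.,:;]?(\s|$))           # 5, 6) stop before optional punctuation, then space or end
%x';

// Hypothetical user input, invented for this illustration.
$input = 'See http://example.com/page, or visit www.example.org. Thanks';

preg_match_all($pattern, $input, $found);
preg_match_all($annotated, $input, $foundAnnotated);

// $found[0] holds the full matches; both patterns should agree.
print_r($found[0]);
```

On this input the matches should be http://example.com/page and www.example.org, with the trailing comma and period left out. Only the four characters in the lookahead are trimmed, so a URL followed by something like ) or ! would keep that character in the match.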

 

redhatlab
msg:4209743
5:40 am on Oct 1, 2010 (gmt 0)

Hi,

I don't know if you know of a site called "regexpal"; it will help you tremendously with your task.

A couple of samples:

^(http|ftp)://(www\.)?.+\.(com|net|org)$

and


'/^(http|https|ftp):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}'.'((:[0-9]{1,5})?\/.*)?$/i'
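
A note on how these samples behave: both are anchored with ^ and $, so they test whether an entire string is a URL rather than locating URLs inside a longer piece of text. The sketch below illustrates that with the second sample, joined into one string; the inputs and variable names are hypothetical.

```php
<?php
// Second sample from the reply, with the two concatenated parts joined.
$strict = '/^(http|https|ftp):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}((:[0-9]{1,5})?\/.*)?$/i';

// Hypothetical inputs, invented for this illustration.
$bareUrl  = 'http://example.com/page';
$sentence = 'See http://example.com/page, thanks';

// preg_match returns 1 on a match and 0 when there is none.
$matchesBare     = preg_match($strict, $bareUrl);   // the whole string is a URL
$matchesSentence = preg_match($strict, $sentence);  // the URL is embedded in text

var_dump($matchesBare, $matchesSentence);
```

The bare URL matches, while the sentence does not, because ^ requires the protocol to sit at the very start of the string. That makes this style better suited to validating a single form field than to scanning free-form user input.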