
Forum Moderators: coopster & jatar k


regex for finding urls in user input

a working solution, to be used or improved

4:13 pm on Sep 27, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:July 30, 2009
posts: 130
votes: 0

I spent quite a while looking for a good regex to match urls and found no shortage of strict ones, but nothing suitable for processing user input, which differs in two key ways. First, urls written by users are not always in encoded form. Second, urls may be preceded or followed by many types of punctuation, without a separating space. I thought I'd post my solution here in case it makes things easier for somebody else in my position. Also, if you have any suggestions to improve the regex, please post them, but keep in mind that it is purposely not very strict.

I decided on the following parameters:
1) starts with protocol:// or www.
2) contains .[top-level-domain]
3) has at least 1 character between the previous two
4) preceded by anything
5) followed by space or EOF
6) does not include last character if it is punctuation
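
For anyone who wants to experiment, here is a rough Python sketch of how the six parameters above could be expressed. It is only an illustration of the rules as listed, not the regex from this post. The TLD list, the set of trailing punctuation characters, and the names TLDS, TRAILING, URL_RE and find_urls are all assumptions made for the example.

```python
import re

# Assumption: a short illustrative TLD list; a real one would be much longer.
TLDS = "com|org|net|edu|gov|info|io"
# Assumption: which characters count as trailing punctuation for rule 6.
TRAILING = r"""[.,;:!?)\]}'"]"""

URL_RE = re.compile(
    r"(?:[a-z]+://|www\.)"    # 1) starts with protocol:// or www.
    r"\S+?"                   # 3) at least one character in between
    r"\.(?:" + TLDS + r")"    # 2) contains .[top-level-domain]
    r"\S*?"                   # rest of the url, unencoded characters allowed
    # 5) followed by space or end of input;
    # 6) one trailing punctuation mark is left out of the match
    r"(?=" + TRAILING + r"?(?:\s|\Z))",
    re.IGNORECASE,
)


def find_urls(text):
    """Return every url-like substring of text.

    Rule 4 (preceded by anything) holds because the pattern has no
    anchor or lookbehind on the left side.
    """
    return [m.group(0) for m in URL_RE.finditer(text)]


if __name__ == "__main__":
    sample = "Try http://example.com/a|b?x=1, then (www.test.org/page)"
    for url in find_urls(sample):
        print(url)
```

Because only one trailing punctuation character is dropped, input such as a url followed by a closing quote and a comma keeps the quote in the match. Whether that is acceptable depends on how loose you want the matching to be.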

The regex:
5:40 am on Oct 1, 2010 (gmt 0)

Junior Member

5+ Year Member

joined:Dec 29, 2008
posts: 65
votes: 0


I don't know if you know of a site called "regexpal", but it will help you tremendously with your task.

A couple of samples: