Forum Moderators: coopster
I am trying to replace any occurrences of a non-active bit of a string (a bare URL) and make it active.
eg
"the link to www.webmasterworld.com was.."
to
"the link to <a href="http://www.webmasterworld.com">www.webmasterworld.com</a> was.."
Are there any functions for URL extraction in this case, or do I need to use a regex to pull it out?
$string = "the link to www.webmasterworld.com was..";
$pattern = "some pattern for any standard url";
$replacement = "<a href=\"http://www.webmasterworld.com\">www.webmasterworld.com</a>";
$string = preg_replace($pattern, $replacement, $string);
I can't see how to get the back reference in there as you would in a URL rewrite.
Am I even in the right area? :)
cheers
$string = "the link to www.webmasterworld.com was..";
$pattern = "/(www.*com)/Uis";
$replacement = "<a href=\"http://$1\">$1</a>";
$string = preg_replace($pattern, $replacement, $string);
The regular expression for $pattern is where you will be spending some time in order to match the possibilities.
s - match all characters including newlines with the dot metacharacter so that if the URL is split over two or more lines, the regex still works.
U - ungreedy. This means it stops at the first "com", which is why I would probably choose to search for
$pattern = "/(www.*\.com)/Uis";
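so it stops at the first ".com" - note this will still be a problem with domains like "www.commercial.com", where the match ends at "www.com". You may wish to stop at the first whitespace instead if you *know* the URI will be followed by a space. Drop the U modifier for that one, otherwise the ungreedy [^\s]* matches nothing and you only capture "www":
$pattern = "/(www[^\s]*)/is";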
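To see how the modifiers change the capture, here is a small sketch that runs the variants against one test string. The test string and the variable names ($greedy, $ungreedy and so on) are made up for illustration, not taken from your data.

```php
<?php
// Illustrative test string (an assumption, not from the original post).
$string = "see www.commercial.com and www.webmasterworld.com today";

// Greedy: .* runs on to the LAST ".com" in the string.
preg_match('/(www.*\.com)/is', $string, $greedy);

// Ungreedy (U): .* stops at the FIRST ".com", cutting "www.commercial.com" short.
preg_match('/(www.*\.com)/Uis', $string, $ungreedy);

// Stop at whitespace, no U: [^\s]* grabs everything up to the next space.
preg_match('/(www[^\s]*)/is', $string, $toSpace);

// Same pattern WITH U: the lazy [^\s]* is happy to match zero characters.
preg_match('/(www[^\s]*)/Uis', $string, $lazyToSpace);

echo $greedy[1], "\n";      // www.commercial.com and www.webmasterworld.com
echo $ungreedy[1], "\n";    // www.com
echo $toSpace[1], "\n";     // www.commercial.com
echo $lazyToSpace[1], "\n"; // www
```

The whitespace-delimited version without U is probably the most forgiving of the four, though it will also swallow trailing punctuation such as a full stop at the end of a sentence.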
Tom
I have hacked a bit of code to almost do it. I was wondering if I could get a pointer. I think my regex is to blame.
I get the following error
Warning: Delimiter must not be alphanumeric or backslash in
Cheers
//######### Code
$string = "www.test.com/test/test.htm www.test2.co.uk";
$pattern = "/www/";
$replacement = "http://www";
$string = preg_replace($pattern, $replacement, $string);
print "$string<hr>";
$urls = '(http|file|ftp)';
$ltrs = '\w';
$gunk = '/#~:.?+=&%@!\-';
$punc = '.:?\-';
$any = "$ltrs$gunk$punc";
preg_match_all("{
    \b
    $urls :
    [$any] +?
    (?=
        [$punc] *
        [^$any]
      |
        $
    )
}x", $string, $matches);
foreach ($matches[0] as $u) {
echo "<a href='$u'>$u</a><br>\n";
//Do a preg_replace here for each value of $u in the string
// However it is here where it goes mighty wrong
$pattern = "$u";
$replacement = "<a href=\"$u\">$u</a>";
$string = preg_replace($pattern, $replacement, $string);
}
print "<b>$string</b>";
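The warning most likely comes from the preg_replace() call inside the loop: $pattern is the bare URL, so PCRE takes its first character ("h") as the delimiter. A minimal sketch of one way around it, assuming the URLs have already been collected into an array. The linkify_all() helper and its parameter names are hypothetical, not part of the code above.

```php
<?php
// Hypothetical helper: wrap each already-extracted URL in an anchor tag.
function linkify_all(string $text, array $urls): string
{
    // array_unique() avoids wrapping the same URL twice.
    foreach (array_unique($urls) as $u) {
        // preg_quote() escapes regex metacharacters in the URL; its second
        // argument also escapes the "/" delimiter wrapped around the pattern.
        $pattern = '/' . preg_quote($u, '/') . '/';
        $replacement = '<a href="' . $u . '">' . $u . '</a>';
        $text = preg_replace($pattern, $replacement, $text);
    }
    return $text;
}

$result = linkify_all(
    "http://www.test.com/test/test.htm http://www.test2.co.uk",
    ["http://www.test.com/test/test.htm", "http://www.test2.co.uk"]
);
echo $result, "\n";
```

Since each $u is a literal string at that point, str_replace($u, $replacement, $text) would also work and needs no delimiters at all. Either way, a URL that is a prefix of another one in the list would still get wrapped twice, so this is only a starting point.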
I probably write 25 regexes per day for my regular work (research that involves grepping through text files for variant spellings from 16th-century documents), but most of them are dead easy, so it's fun to get a little beyond that. When it goes way beyond that, I have to call Coopster and Timster... (those guys aren't related, are they?).
Tom