Regular expression help


pbarney

4:57 pm on Apr 19, 2005 (gmt 0)

10+ Year Member



Hi everyone,

I'm working on a deadline and need some help with regular expressions.

I've got a slew of HTML files with style tags and HTML formatting that I want to strip. Fortunately, NoteTab handled the job well, but I need to turn some URLs back into hrefs.

Here is what the files now contain:

Here is some sample text <http://www.internet.com/>www.internet.com

How would I structure the regular expression to turn that into:

<a href="http://www.internet.com/">www.internet.com</a>

AND here's one for you super-geniuses. If the target is NOT on my server, how would you make it look like this:

<a target="_blank" href="http://www.internet.com/">www.internet.com</a>

Of course, in a perfect world, I'd love to remove only the style and font tags, but I haven't found a tool that will help me do that.

ironik

10:32 pm on Apr 19, 2005 (gmt 0)

10+ Year Member



Maybe something like:


$text = "Here is some sample text <http://www.internet.com/>www.internet.com some more text to test <http://www.test.com/>test";

function convertHrefs($text, $debug = false)
{
    // Protocols that may be turned into links; remove any you want to disallow.
    $allowProtocol = array('http', 'https', 'ftp', 'wais', 'telnet');
    $protocols = implode('|', $allowProtocol);
    // Group 1: the full URL between < and >. Group 2: the link text that follows.
    $pattern = "/<((?:" . $protocols . "):\/\/[^>\s]+)>([^<>\s]+)/i";
    if ($debug)
    {
        // Dump every match so the captured groups can be inspected.
        preg_match_all($pattern, $text, $matches);
        print_r($matches);
    }
    return preg_replace($pattern, "<a href=\"$1\" target=\"_blank\">$2</a>", $text);
}

echo '<br />Replacement: ' . convertHrefs($text);

That will work as long as the link text after the closing > doesn't contain any spaces or > characters. The protocol array is there so you can keep certain protocols from being converted (e.g. remove 'telnet' from the list to disallow it).

I probably wouldn't use this myself because of the syntax you're using. I prefer wiki or BBCode style markup, where an ending delimiter marks the end of the descriptive name for the link.
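For the second part of the question (adding target="_blank" only when the link points off your own server), one possible approach is to decide per match with preg_replace_callback() and parse_url(). This is only a rough sketch: the function name convertHrefsByHost and the host www.example.com are made up for illustration, it assumes PHP 5.3 or later for the anonymous function, and it assumes the URLs contain no double quotes.

```php
<?php
// Hypothetical host name standing in for "my server".
$myHost = 'www.example.com';

// Turns <url>text into a link; off-site links also get target="_blank".
function convertHrefsByHost($text, $myHost)
{
    // Group 1: the full URL between < and >. Group 2: the link text.
    $pattern = '/<((?:https?|ftp):\/\/[^>\s]+)>([^<>\s]+)/i';

    return preg_replace_callback($pattern, function ($m) use ($myHost) {
        $host = parse_url($m[1], PHP_URL_HOST);
        // Treat the link as external when the host is missing or differs
        // from our own host (compared case-insensitively).
        $external = !is_string($host) || strcasecmp($host, $myHost) !== 0;
        $target = $external ? ' target="_blank"' : '';
        return '<a' . $target . ' href="' . $m[1] . '">' . $m[2] . '</a>';
    }, $text);
}

$text = 'See <http://www.internet.com/>www.internet.com and <http://www.example.com/about.html>about';
echo convertHrefsByHost($text, $myHost);
```

With the sample text, the first link should come out with target="_blank" and the second without it, since only the second one matches the configured host.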

coopster

11:30 am on Apr 21, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



On a side note, rather than using NoteTab, are you aware of the PHP strip_tags() [php.net] function?
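As a rough illustration of how that could apply to the original wish of removing only the style and font tags: strip_tags() takes an optional second argument listing the tags to keep, but it leaves the text inside a style block in place, so the sketch below drops style blocks with preg_replace() first. The sample markup and the variable names are made up for the example.

```php
<?php
// Made-up sample markup for illustration.
$html = '<style>p { color: red; }</style><p><font color="red">Hello</font> <a href="/x">link</a></p>';

// Drop <style> blocks together with the CSS they contain.
$noStyle = preg_replace('/<style\b[^>]*>.*?<\/style>/is', '', $html);

// Option 1: remove only the <font> open and close tags, keeping their content.
$clean = preg_replace('/<\/?font\b[^>]*>/i', '', $noStyle);

// Option 2: strip every tag except the ones listed in the second argument.
$allowed = strip_tags($noStyle, '<p><a>');

echo $clean, "\n", $allowed, "\n";
```

Option 1 leaves every other tag untouched, while option 2 removes anything not on the allowed list, so which one fits depends on how much of the remaining markup you want to keep.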