strange html pattern from editor

omoutop

7:53 am on Feb 2, 2010 (gmt 0)


Hello all

I have run into a small problem I can't solve (or find the logic behind, so as to prevent it).

In our admin panel/backend we have a section where authorized personnel write the site's texts.
Quite often they copy/paste from MS Word, with unpredictable results (we use a WYSIWYG editor for them).
Of course, we take every precaution to filter out the bad MS Word tags.

But recently I have noticed a new behaviour.
One text may contain something like: <p><br> <br></p>
While another may contain: <p><br><br></p>
(note the missing space in the second one)

Since this isn't a common pattern across all texts, how can I prevent it, delete it, or replace it?

mattclayb

2:48 pm on Feb 2, 2010 (gmt 0)


Have you tried using strip_tags()? For example:

$HTMLData = "<p>Hello World!</p>";
$CleanData = strip_tags($HTMLData);
echo $CleanData;

// Outputs: Hello World!

If you are using a WYSIWYG editor such as FCKeditor, you can control which tags are removed in its config settings.
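
Since the texts in question need to keep some markup, it may help to know that strip_tags() takes an optional second argument listing tags to keep. A minimal sketch, assuming a made-up sample of pasted editor output (the $html string is hypothetical, not from the thread):

```php
<?php
// Hypothetical sample of pasted editor output; not taken from the thread.
$html = '<p class="MsoNormal">Hello <span style="color:red">World</span>!<br></p>';

// The second argument lists the tags to keep; all other tags are stripped.
// Attributes on the kept tags are NOT removed, so class="MsoNormal" survives.
$clean = strip_tags($html, '<p><br>');

echo $clean, "\n";
// Outputs: <p class="MsoNormal">Hello World!<br></p>
```

Note that this would leave a paragraph such as <p><br><br></p> untouched, since both tags are on the keep list, so on its own it probably does not remove the pattern described in the question.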

rocknbil

8:53 pm on Feb 2, 2010 (gmt 0)


If that's the only problem,

$out = preg_replace('/<p><br\s*\/*>\s*<br\s*\/*><\/p>/i','',$out);

This should work for both variants: \s* matches zero or more whitespace characters, and the \/* inside each break tag handles XML-style <br /> tags too.
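
A quick way to check the pattern against both variants from the question, plus an XML-style one. This is only a sketch; the $samples strings are made up for illustration:

```php
<?php
// Pattern from the post above: a paragraph containing nothing but two breaks.
$pattern = '/<p><br\s*\/*>\s*<br\s*\/*><\/p>/i';

// Hypothetical sample inputs covering the variants discussed in the thread.
$samples = [
    '<p>Intro</p><p><br> <br></p><p>Next</p>',    // space between the breaks
    '<p>Intro</p><p><br><br></p><p>Next</p>',     // no space
    '<p>Intro</p><P><BR /><br/></P><p>Next</p>',  // XML-style, mixed case
];

$results = [];
foreach ($samples as $sample) {
    // Remove every occurrence of the empty double-break paragraph.
    $results[] = preg_replace($pattern, '', $sample);
}

print_r($results);
// Each entry comes out as: <p>Intro</p><p>Next</p>
```

The pattern only matches a bare <p> tag, so a paragraph carrying attributes (for example a class left over from Word) would not be caught by it.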

omoutop

9:45 am on Feb 3, 2010 (gmt 0)


Thank you, guys, but the problem is solved.
I was using the following regex in a function, and it caused the problem:
$foo = preg_replace("/\r?\n/", "", $foo);

Although this should only remove line breaks, it also caused the problems above.
I haven't been able to trace the "why"; I only confirmed that with this regex removed, all texts so far work properly.
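
The thread never establishes the cause, so the following is only a way to see what that replacement does to markup, not an explanation of what actually happened. It assumes (a guess) that the editor emitted the two breaks on separate lines; remove_line_breaks() is a hypothetical wrapper name and the inputs are made up:

```php
<?php
// Same replacement as in the post above, wrapped in a hypothetical helper.
function remove_line_breaks($text)
{
    // Deletes every LF, and the CR before it if present, with no replacement.
    return preg_replace("/\r?\n/", "", $text);
}

// Hypothetical inputs: editor output with the breaks on separate lines.
$a = remove_line_breaks("<p><br>\r\n<br></p>");  // CRLF between the breaks
$b = remove_line_breaks("<p><br> \n<br></p>");   // trailing space, then LF
$c = remove_line_breaks("first line\nsecond line");

var_dump($a, $b, $c);
// $a is "<p><br><br></p>", $b is "<p><br> <br></p>",
// and $c is "first linesecond line" (the words get glued together).
```

Under that assumption, the only difference between the two patterns in the question would be whether a space preceded the line break. Replacing line breaks with a single space instead of an empty string would at least avoid joining words, as in the third case.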