htmlentities(html entity decode()) situation?

Some data in MySQL encoded, some not

         

salewit

12:06 am on Aug 27, 2007 (gmt 0)

10+ Year Member



I have a rather large database whose data was initially entered with special characters encoded (i.e. &amp; instead of &). After a few months that got too tedious and the data was just dumped in raw, so about 10% of it is encoded.

Anyway, when I read in my data, I'm doing this:

$row = mysql_fetch_assoc($resultID);
$var = htmlentities(html_entity_decode($row['whatever']));

It seems to work. The question is, how do I do this for ALL the fields in a table? I guess what I want to do is the equivalent of:

$row=htmlentities(html_entity_decode(mysql_fetch_assoc($resultID)));

This of course gives me errors. Is there an easier way? Am I going about things wrong?

Habtom

5:13 am on Aug 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes, you have to do it for each field, but you can wrap it in a function:

function my_html($myrow) {
    // decode first so already-encoded values are not encoded twice
    $neat = htmlentities(html_entity_decode($myrow));
    return $neat;
}

... and you can use it as follows:

$var = my_html($row['whatever']);
$var2 = my_html($row['whatever2']);

Habtom
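
To cover every field of a row in one step, as the original question asks, the helper can be mapped over the whole array. This is a minimal sketch rather than a definitive solution: the sample row, its column names, and the explicit ENT_QUOTES / UTF-8 arguments are assumptions, and the array stands in for whatever mysql_fetch_assoc() returns.

```php
<?php
// Helper from the post above, with explicit flags and charset (an assumption).
function my_html($value) {
    $decoded = html_entity_decode($value, ENT_QUOTES, 'UTF-8');
    return htmlentities($decoded, ENT_QUOTES, 'UTF-8');
}

// Hypothetical row standing in for the result of mysql_fetch_assoc().
$row = array(
    'title' => 'Tom &amp; Jerry', // stored already encoded
    'maker' => 'Smith & Sons',    // stored raw
);

// Apply the helper to every column of the row at once.
$row = array_map('my_html', $row);

echo $row['title'], "\n"; // Tom &amp; Jerry
echo $row['maker'], "\n"; // Smith &amp; Sons
```

Both values come out encoded exactly once, whichever way they were stored.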

salewit

5:46 am on Aug 27, 2007 (gmt 0)

10+ Year Member



Thank you so much. I will definitely use that. But now I've discovered a bigger problem. I have a database of 400 different product descriptions. Here's an abbreviated example of one text field:

You'll follow the TR&O 1522 ...... and more.<p>On the run out to ..<br>On DVD

There's some HTML in there, and an & symbol. If I use htmlentities(html_entity_decode()), I lose the <p> and the <br> while fixing the & symbol, no?

My big question here is... Do I store text in a database totally encoded? If not, how do I retrieve paragraphs or line breaks? Should I go through the database and clean everything to one method or another? Any good articles out on this? I did some searching just now and I'm not really finding the right subject matter.
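
The concern about losing the markup can be checked directly. The sketch below runs the decode-then-encode pair over a shortened version of the sample description (the shortening and the explicit flags are assumptions). The tags are not removed, but they are escaped, so a browser would print them as literal text instead of rendering them.

```php
<?php
// Shortened stand-in for the product description quoted above.
$text = 'Follow the TR&O 1522<p>On the run out to<br>On DVD';

// Decode, then encode: the & is fixed, but < and > are escaped as well.
$out = htmlentities(html_entity_decode($text, ENT_QUOTES, 'UTF-8'), ENT_QUOTES, 'UTF-8');

echo $out, "\n";
// Follow the TR&amp;O 1522&lt;p&gt;On the run out to&lt;br&gt;On DVD
```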

vincevincevince

5:57 am on Aug 27, 2007 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If you are storing HTML source then you must store full entities with it (&amp; etc.)

I suggest that you fix everything first!

My preferred method of fixing such a thing is this:


$dh = mysql_query("SELECT `id`,`text` FROM `articles`");
while ($r = mysql_fetch_assoc($dh))
{
    // first, decode &amp; back to & to avoid &amp;amp;
    $r['text'] = str_replace("&amp;", "&", $r['text']);
    // next, decode entities that are already correct, to avoid &amp;copy; etc.
    $r['text'] = str_replace("&pound;", "£", $r['text']);
    $r['text'] = str_replace("&copy;", "©", $r['text']);
    // ...etc...

    // now, encode everything again, & first:
    $r['text'] = str_replace("&", "&amp;", $r['text']);
    $r['text'] = str_replace("£", "&pound;", $r['text']);
    $r['text'] = str_replace("©", "&copy;", $r['text']);
    // ...etc...

    mysql_query("UPDATE `articles` SET `text` = '".mysql_real_escape_string($r['text'])."' WHERE `id` = $r[id]");
}
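
As a possible alternative to listing every entity by hand, a regular expression can encode only the ampersands that do not already begin an entity, leaving the HTML tags and existing entities untouched. This is a sketch under stated assumptions, not the method from the post above: the function name encode_bare_ampersands and the sample string are hypothetical, and raw text that happens to look like an entity (for example "AT&T;") would be left alone.

```php
<?php
// Hypothetical helper: encode & only when it does not start a named,
// decimal, or hexadecimal entity.
function encode_bare_ampersands($html) {
    $pattern = '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/';
    return preg_replace($pattern, '&amp;', $html);
}

// Mixed sample: one raw &, two existing entities, and a tag.
$mixed = 'TR&O 1522 &amp; more &copy; 2007<p>On DVD';
$fixed = encode_bare_ampersands($mixed);

echo $fixed, "\n";
// TR&amp;O 1522 &amp; more &copy; 2007<p>On DVD
```

Running the function a second time on its own output changes nothing, so it is safe to apply to rows that were already cleaned. Raw characters such as £ or © are not converted by this pattern and would still need separate handling.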