Forum Moderators: open

Message Too Old, No Replies

My Sql

Apostrophe problems.

         

gosman

1:28 pm on Jul 7, 2008 (gmt 0)

10+ Year Member



I have a table that contains a field called Article which is a TEXT field. As the name suggests this field hold's articles for our website. When these articles are displayed on our site if the article contains an apostrophe it is displayed as a small square box. Our page encoding is utf-8.

I thought I would update the database as follows to replace these apostrophes.

update cms set article=replace(article,"'","’")

When I run the above command it just returns 0 rows affected

Anybody any ideas why the apostrophes are not being replaced?

physics

7:32 pm on Jul 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Is your database table encoding also utf-8?

It sounds like the problem might be that the apostrophes aren't properly encoded, so trying to replace ' wouldn't work. So you need to find out what they are.
To identify the character number of the offending entry run a query like this (example, you'll need to alter it for your db):


select
lpad(substring(col1, i, 1), i, " ") as char_i,
i,
ascii(substring(col1, i, 1)) as charcode
from test
cross join integers
where col1 <> "Xaprbs" and i between 1 and length(col1);

[xaprb.com...]
(see link for more explanation of that)

Or you could select the data into a file, isolate the "offending" character (delete the other data) and then do an octal dump. So,


$od -c foo.txt

where foo.txt contains the character.

Once you know what the character number is you can replace it like so:


update cms set article = replace(col1, char(NNN), "'");

Where NNN is the character number of the offending apostrophe-like char.

Demaestro

7:38 pm on Jul 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



Were they copy and pasted in from MS Word?

If so that is the issue. MS Word uses non-ascii, non-standard characters for single and double quotes so that the quotes appear to angle into the words which looks cool and breaks when ever copied and pasted into something that isn't Microsoft.

Those characters can live in the database but when you re-display them in a web browser they get replaced by the browser because they aren't in the standard ascii range.

I see this happen all the time. If this is what is happening it would also explain why your replace isn't working since you are trying to find and replace "real" double and single quotes and it isn't finding them because there are no "real" ones, just "fake" but "cool looking" MS ones.

[edited by: Demaestro at 7:43 pm (utc) on July 7, 2008]