Forum Moderators: coopster
You may or may not find this thread on Unicode Support [webmasterworld.com] useful (depending on whether you have the conversion issues solved already or not).
Tom
[edited by: ergophobe at 6:59 pm (utc) on Oct. 15, 2004]
There are also standalone programs that will convert encodings. You could do an SQL dump, convert the text file with a standalone, and upload it to your DB.
Honestly, though, what's going to take the time is just going through the steps. The 30 seconds it takes to do the character conversion is nothing.
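For what it's worth, if you'd rather not hunt down a standalone converter, the dump-convert-reload step can be roughed out in PHP itself. This is only a sketch: convert_dump() is a made-up helper name, and the file names and encodings in the usage comment are assumptions you would replace with your own.

```php
<?php
// Hypothetical helper (not from this thread): re-encode a text dump line by line.
// Reading by line assumes a source encoding in which "\n" is a single unambiguous
// byte (UTF-8, ISO-8859-x, and similar); it is not suitable for UTF-16 input.
function convert_dump(string $inPath, string $outPath, string $from, string $to): void
{
    $src = fopen($inPath, 'rb');
    if ($src === false) {
        throw new RuntimeException("Cannot open $inPath for reading");
    }
    $dst = fopen($outPath, 'wb');
    if ($dst === false) {
        fclose($src);
        throw new RuntimeException("Cannot open $outPath for writing");
    }

    // Convert one line at a time so a large dump never has to fit in memory.
    while (($line = fgets($src)) !== false) {
        fwrite($dst, mb_convert_encoding($line, $to, $from));
    }

    fclose($src);
    fclose($dst);
}

// Example call (paths and encodings are assumptions):
// convert_dump('dump.sql', 'dump-utf8.sql', 'ISO-8859-1', 'UTF-8');
```

Check the converted file in an editor before loading it back into the DB, and keep the original dump until you are sure the result is right.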
But I WILL have to do what you specified at some point :-(.
And dumping the index is definitely NOT an option; re-indexing on this sissy of a server takes AGES.
Also, 800/9 != 12 lol
<?php
$str = file_get_contents($_GET['url']);
echo "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n";
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title></title>
</head>
<body>
<?php
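// Pass the charset explicitly so htmlentities() matches the ISO-8859-1 declared above.
echo htmlentities(mb_convert_encoding($str, 'ISO-8859-1', mb_detect_encoding($str)), ENT_QUOTES, 'ISO-8859-1');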
?>
</body>
</html>
Is mb_convert_encoding($str, 'ISO-8859-1', mb_detect_encoding($str)) similar to what you used?
Also, 800/9 != 12 lol
What's an order of magnitude and a 25% error between friends? Somehow I was not looking at your post and was remembering different numbers when I wrote that.
Is mb_convert_encoding($str, 'ISO-8859-1', mb_detect_encoding($str)) similar to what you used?
No, for all the reasons outlined in my long post in the thread I referenced.
You're trying to figure out the encoding of form output that is being sent to you labeled as ISO-8859-1 (because that's the encoding your web page declares) but, unfortunately, is being typed into the form as something else.
You have to make sure that ISO-8859-1 is not one of the encodings PHP is allowed to detect; otherwise you will not get any conversion. Any sequence of bytes is valid ISO-8859-1, so if it is on the list it can always match, and converting ISO-8859-1 to ISO-8859-1 changes nothing. So you need to have some idea which encodings might possibly come in, make a list of the encodings you will test for (none of which is ISO-8859-1), and tell PHP about that list using mb_detect_order(). Otherwise it won't work at all.
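Roughly, the idea looks like the sketch below. It is not the code from the other thread: to_latin1() is a made-up helper name, and the candidate list ('ASCII', 'UTF-8') is only an assumption; put in whatever encodings you actually expect to receive, just not ISO-8859-1.

```php
<?php
// Candidate encodings are an assumption for illustration.
// ISO-8859-1 is deliberately left off the list.
mb_detect_order(['ASCII', 'UTF-8']);

// Hypothetical helper (not from this thread): convert incoming text to ISO-8859-1
// only when it positively matches one of the candidate encodings.
function to_latin1(string $str): string
{
    // Strict mode makes detection return false instead of a best guess
    // when the bytes are not valid in any candidate encoding.
    $from = mb_detect_encoding($str, mb_detect_order(), true);

    if ($from === false) {
        // Nothing on the list matched; leave the input untouched.
        return $str;
    }

    return mb_convert_encoding($str, 'ISO-8859-1', $from);
}
```

With this list, UTF-8 input gets converted, while input that is not valid in any listed encoding (for instance text that really is ISO-8859-1 already) falls through unchanged.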
Look at my messages in the Unicode thread I mentioned previously for more detail.
[edited by: ergophobe at 4:26 pm (utc) on Oct. 16, 2004]