PHP IMAP Functions and Character Encoding

Character encoding problems in e-mail messages received through php_imap

         

taylanpince

8:08 pm on Nov 15, 2005 (gmt 0)

10+ Year Member



hello there,

i found the forum while i was researching some character encoding problems i am having with the webmail application i am developing. i read the information posted on different topics, and there seems to have been a lot of discussion on the subject of character encoding. so here goes mine:

i am developing a webmail application for personal use that would give me the chance to access several e-mail accounts at once, instead of logging in to different webmail interfaces. i am using php's imap functions, but i am having trouble figuring out all the different encodings. the system has to be able to show turkish characters (ISO-8859-9), and that's what the html output is using as the main charset.

i got most of the conversions under control (mime headers work fine; utf-8, base64 and quoted-printable are properly converted). however, some e-mails still contain sequences like =20, =FD or =FE even after the conversion. Another problem is with e-mails that include quotes from other messages. It seems like the quotations keep their own charsets, and when I convert according to the charset of the original message, the quoted text ends up with strange characters.

i realize that this is a huge issue, and several different solutions have to be found. but i would appreciate any suggestions.

cheers,
taylan

coopster

5:28 pm on Nov 30, 2005 (gmt 0)

WebmasterWorld Administrator 10+ Year Member



Hello taylanpince, welcome to WebmasterWorld.

Not a lot of feedback or dialog here yet. Have you come up with any solutions to date? Which functions are you using for conversion(s)?

taylanpince

6:35 pm on Nov 30, 2005 (gmt 0)

10+ Year Member



the issue is not solved yet, but i didn't have much time to work on the application either.

the problematic messages are always identified as ASCII by php. i am using the multibyte functions, especially mb_convert_encoding and mb_detect_encoding. they seem to be giving the best results so far.

as i said before, i managed to solve all the issues but this one. some messages, usually but not exclusively the ones received from mailing lists, show turkish characters in a "B=C3=BCt=C3=BCnle=C5=9Fik" form. sometimes this problem is only seen in quoted portions of the message, but not always.

i know that squirrelmail can show these mails properly, so there must be a proper way of doing it.

once again, i would be really glad if someone could respond and help me out.

cheers,
taylan
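
A note on the "B=C3=BCt=C3=BCnle=C5=9Fik" form described above: it looks like UTF-8 text that is still quoted-printable encoded (=C3=BC is the UTF-8 byte pair for ü, =C5=9F the pair for ş). Text in that state contains only ASCII bytes, which would explain why mb_detect_encoding reports ASCII for it. The sketch below is a minimal illustration, not the poster's code and not how squirrelmail handles it. decode_text_part is a hypothetical helper, and it assumes the transfer encoding and charset have already been read from each part's own headers (for example from the result of imap_fetchstructure) rather than from the top-level message.

```php
<?php
// Hypothetical helper, not part of php_imap or mbstring.
// $transferEncoding and $sourceCharset are assumed to come from the
// headers of the individual MIME part being displayed.
function decode_text_part($raw, $transferEncoding, $sourceCharset, $targetCharset = 'ISO-8859-9')
{
    // Step 1: undo the Content-Transfer-Encoding to get the raw bytes back.
    switch (strtolower($transferEncoding)) {
        case 'quoted-printable':
            $bytes = quoted_printable_decode($raw);
            break;
        case 'base64':
            $bytes = base64_decode($raw);
            break;
        default:
            // 7bit, 8bit, binary: nothing to undo.
            $bytes = $raw;
    }

    // Step 2: convert from the part's declared charset to the page charset.
    return mb_convert_encoding($bytes, $targetCharset, $sourceCharset);
}

// The sample from the thread, treated as quoted-printable UTF-8.
$sample = 'B=C3=BCt=C3=BCnle=C5=9Fik';
$converted = decode_text_part($sample, 'quoted-printable', 'UTF-8');

// Print the bytes as hex so the result is visible regardless of terminal charset.
echo bin2hex($converted), "\n"; // 42fc74fc6e6c65fe696b ("Bütünleşik" in ISO-8859-9)
```

The order matters here: running charset detection or conversion before the quoted-printable step sees only ASCII and changes nothing. A conversion like this works per part, so it would not by itself repair a quotation that arrives in a different charset from the rest of the same part.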