Forum Moderators: bill

Multi-language questions

     
10:40 am on Mar 5, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 7, 2005
posts:636
votes: 0


I've long been mulling over the issues of translating my site into various Asian languages and I've now decided to make some test pages in Korean, Japanese, Chinese and Thai.

I'm ploughing through the mind-numbingly complex issues of character encoding, yadda yadda, and I'm putting together the Korean test page, but it's made even more complicated by the way I serve the headers. Basically, all my pages use the .xhtml extension but I serve the text/html MIME type. To make future changes easy, the headers are pulled into each page by a PHP include of a single small file that contains the header information... and that looks like this for the Korean test page...

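<?php
// Charset and MIME type live here so they can be changed for every page at once.
$charset = "euc-kr";
$mime = "text/html";
// header() only works if nothing has been output before this point.
header("Content-Type: $mime; charset=$charset");
?>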
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="ko">

... How does this look? Is there anything I've left out, or need to be aware of?

11:32 am on Mar 5, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 7, 2005
posts:636
votes: 0


Oh boy!... now I'm really confused :-(

I use WinSCP but it won't allow me to edit documents in Korean and it all comes out as a load of question marks.

What interface should I be using for this? Should I be saving documents in Unicode or UTF-8?...

I'm so confused :msn-cry:

2:18 pm on Mar 5, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 7, 2005
posts:636
votes: 0


I'm also getting the following PHP error at the top of the page (the header info being called in a PHP include right at the top of the page)...

Warning: Cannot modify header information - headers already sent by (output started at /homepages/yadda/yadda/my-korean-page.xhtml:1) in /homepages/yadda/yadda/my-korean-page.php on line 4

I also noticed that even though the doctype declaration is making it through and shows up in the page source, the W3C validator says no doctype was found.

What does all this mean? What on Earth am I doing wrong here?!
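
A note on that warning: PHP raises "headers already sent" when any byte of output precedes the header() call. The message points at line 1 of my-korean-page.xhtml, so something at the very top of that file is probably being output first. Common culprits are a UTF-8 byte order mark (BOM) added by the editor, or whitespace before the opening <?php tag; a stray BOM would also plausibly explain a validator not seeing the doctype. The Python sketch below is one way to check a file's first bytes. The function name leading_output and the sample byte strings are hypothetical, made up for illustration and not taken from the actual site.

```python
import codecs

def leading_output(data: bytes) -> str:
    """Describe what, if anything, precedes the first '<?php' tag in a file's bytes."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 bom"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16 bom"
    tag_at = data.find(b"<?php")
    if tag_at == -1:
        return "no php tag"
    if tag_at > 0:
        # Something sits before the tag: either blank space or real content.
        return "whitespace" if data[:tag_at].strip() == b"" else "other output"
    return "clean"

# Hypothetical file contents, as different editors might save them.
samples = {
    "saved with BOM": codecs.BOM_UTF8 + b"<?php include 'header.php'; ?>",
    "blank first line": b"\n<?php include 'header.php'; ?>",
    "starts with tag": b"<?php include 'header.php'; ?>",
}
for label, data in samples.items():
    print(label, "->", leading_output(data))
```

To run it against a real page, read the file in binary mode, for example open(path, "rb").read(), and pass the result to leading_output. Anything other than "clean" means output starts before PHP gets a chance to send the header.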

1:56 am on Mar 6, 2008 (gmt 0)

Administrator from JP 

WebmasterWorld Administrator bill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Oct 12, 2000
posts:15136
votes: 167


.xhtml extension

I'm not sure why you're using this non-standard extension, but as long as your server is configured properly to serve pages this way then you can use any extension you want. It's not standard and may confuse users who manually input URLs though.

xhtml11.dtd

Why are you using XHTML 1.1? Have you ever read "Why most of us should NOT use XHTML" [webmasterworld.com]? If you must use XHTML then 1.0 Strict or Transitional is the suggested route. Even the W3C doesn't use XHTML 1.1. Are you certain you're using it properly? XHTML 1.0 and 1.1 are not the same thing.

8:09 am on Mar 6, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 7, 2005
posts:636
votes: 0


Honestly, no, I'm not entirely certain. I've spent a great deal of time trying to educate myself about properly operating websites and I've come to the conclusion that it's a waste of time seeking the 'standard'. What hope does a semi-pro like me have of getting it right when there apparently is no right?

I'm a part time webmaster but my website is starting to take off in my sector. A major travel guide print publisher is about to run full page print ads in their Japan and Malaysia guides and it's something that could really make my website... but until such time, I don't have the resources to make mistake after mistake. I know I can learn from mistakes but what I really need is someone to tell me what to do because I can't see the fire for the smoke.

I don't want to get into a debate over the pros and cons of serving XHTML with the text/html MIME type but that's the route I've chosen and I hope in years to come the standard will be adopted. If not, I've got a lot of work to do to reverse my decision. If all I need to do is change the MIME type in the future, it's all in an include and I can change the lot in minutes.

Regardless, I'm having big problems understanding the language issues and I can't get these Korean pages to work properly. If I save in Unicode, the PHP includes won't work. If I save in UTF-8, Firefox won't display the characters properly. If I save in ANSI, Korean is just question marks... I need someone to draw me a picture because I'll be f#$%&^d if I can work it out this time.
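
For anyone hitting the same three symptoms, it can help to look at the bytes each save option writes. The Python sketch below assumes a Notepad-style editor, where "ANSI" means a Western Windows code page (cp1252 here) and "Unicode" means UTF-16 little-endian with a BOM. Both are assumptions about the editor, and the sample word is only an illustration.

```python
import codecs

text = "한국어"  # sample Korean word ("Korean language"), for illustration only

# "ANSI" (assumed to be cp1252): it has no Korean characters,
# so each one is replaced with a question mark.
ansi = text.encode("cp1252", errors="replace")

# "Unicode" (assumed to be UTF-16 LE with a BOM): two bytes per character here.
utf16 = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# In UTF-16 even ASCII markup gets a zero byte after each character,
# so the byte sequence "<?php" never appears in the file.
php_tag_utf16 = "<?php".encode("utf-16-le")

utf8 = text.encode("utf-8")
euc_kr = text.encode("euc-kr")

print(ansi)                                # b'???'
print(len(utf16), len(utf8), len(euc_kr))  # 8 9 6
print(b"<?php" in php_tag_utf16)           # False
```

Under those assumptions, the question marks come from the ANSI code page having no Korean characters, and a UTF-16 file no longer contains a byte-for-byte <?php tag, which is a likely reason the includes stop working. UTF-8 and EUC-KR both store the text intact, just as different bytes.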

8:45 am on Mar 6, 2008 (gmt 0)

Administrator from JP 

WebmasterWorld Administrator bill is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Oct 12, 2000
posts:15136
votes: 167


I'm guessing that the way you're using XHTML 1.1 is a big part of the problem.

I don't want to get into a debate over the pros and cons of serving XHTML with the text/html MIME type but that's the route I've chosen and I hope in years to come the standard will be adopted.

XHTML 1.1 was never an upgrade from 1.0. A lot of people made that mistake. The XHTML 1.1 route has pretty much been abandoned by most web developers as it's problematic. If you're going to use that standard then you've really got to know what you're doing. It's not for the semi-pro.

language issues

Can your code editor actually properly handle/produce these languages in the format that you are setting as the character encoding in the page? You can't just set the page encoding to UTF-8 in the page header. You've actually got to put properly formatted UTF-8 content in there. In your example you use euc-kr for the charset. Do you have euc-kr encoded content in the page?
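
To make that point concrete, here is a small Python sketch of what happens when the declared charset and the saved bytes disagree. The render function is a hypothetical stand-in for a browser that trusts the Content-Type header; real browsers do more than this, so treat it as an illustration under that assumption.

```python
text = "한국어"  # sample Korean word, for illustration only

utf8_bytes = text.encode("utf-8")    # what a file saved as UTF-8 contains
euckr_bytes = text.encode("euc-kr")  # what a file saved as EUC-KR contains

def render(data: bytes, declared_charset: str) -> str:
    """Decode bytes the way a client trusting the declared charset would (sketch)."""
    return data.decode(declared_charset, errors="replace")

# Declared charset matches the saved bytes: the text survives.
print(render(euckr_bytes, "euc-kr") == text)  # True
print(render(utf8_bytes, "utf-8") == text)    # True

# File saved as UTF-8 but served as euc-kr, or the reverse: the text is garbled.
print(render(utf8_bytes, "euc-kr") == text)   # False
print(render(euckr_bytes, "utf-8") == text)   # False
```

So a page saved as UTF-8 while the include still sends charset=euc-kr would be expected to display badly. Either save the file as EUC-KR, or change $charset to match what the editor actually writes.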

4:45 pm on Mar 6, 2008 (gmt 0)

Moderator from US 

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Dec 10, 2005
posts:5811
votes: 157


As someone who has been doing Asian language web sites (Korean, Japanese, Chinese) for 10+ years, I can certainly understand your frustration! I don't pretend to understand all the nuances, so I try to keep things as simple as possible. Use ASCII text editors as much as possible.

that's the route I've chosen and I hope in years to come the standard will be adopted.

Perhaps not the best decision... First adoption of a new, non-standard technology often involves major struggles. And what do you do if the technology is never accepted as a standard?

6:43 pm on Mar 8, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:Dec 7, 2005
posts:636
votes: 0


Thanks for the comments guys but I've decided I don't have the time for this just now. I'm heading over to Burma next week and I've got too much on my plate, so this is getting shelved for now... I'll dig up this thread again when I can justify spending the time on it.