To view Japanese source code

Difficulties

         

wolfy

11:51 am on Feb 19, 2003 (gmt 0)

10+ Year Member



I have this problem:

I found a web site whose source code I would really like to view (the site is plain HTML). Unfortunately, I can't read the title, meta tags, etc., because I see only numbers. The site is in Japanese, but I have tried viewing the source both on a normal PC and on a Japanese OS, without success. I installed all the available Windows updates, and I also downloaded the site and tried adding a charset declaration to it, but that did not help either.

Any suggestions?

Damian

12:21 pm on Feb 19, 2003 (gmt 0)

10+ Year Member



Were you able to determine for sure that the problem is caused by the Japanese characters? Maybe the source is protected (encoded) by something like Weblock or Html Guard?

takagi

3:59 pm on Feb 19, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello Wolfy,

Stickymail me the URL, and I will give it a try.

Takagi.

tedster

5:47 pm on Feb 19, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Be sure to let us know if you find out what's going on - I know that I'm curious about this one. Thanks in advance.

bill

8:35 am on Feb 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



I'm a little late on this one wolfy but I'll take a look if you want...

wolfy

1:24 pm on Feb 20, 2003 (gmt 0)

10+ Year Member



Thanks to everyone,
I'll let you know whether or not I solve the problem.

takagi, bill: see your sticky mail.
I'll give a reward to whoever solves the problem fastest :)

takagi

2:50 pm on Feb 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Hello Wolfy,

The numbers you saw are Unicode numbers.

Japanese pages are usually in Shift-JIS (charset=Shift_JIS), where 2 bytes are used for one Japanese character. A similar multi-byte method is used when pages are written in Unicode (charset=utf-8), which also makes it possible to use symbols from other languages, such as Korean, Thai, Hebrew, Arabic, Russian, etc.
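To make the byte counts concrete, here is a small Python sketch that encodes two Japanese characters (U+6D77 and U+306E, picked only for illustration) in both charsets. The helper name `byte_length` is made up for this example.

```python
# Compare how many bytes the same two characters take in each charset.
text = "\u6d77\u306e"  # two Japanese characters: a kanji and a hiragana


def byte_length(s: str, encoding: str) -> int:
    """Return the number of bytes needed to store s in the given encoding."""
    return len(s.encode(encoding))


sjis_len = byte_length(text, "shift_jis")  # 2 bytes per character
utf8_len = byte_length(text, "utf-8")      # 3 bytes per character here

print("Shift_JIS:", sjis_len, "bytes")
print("UTF-8:", utf8_len, "bytes")
```

Both are multi-byte encodings, but for these characters UTF-8 needs one byte more per character than Shift-JIS.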

The page you mentioned uses a special way to write a Japanese string. For each character, &#<5-digit-code>; is used, so each character needs 8 bytes. The advantage is that any plain ASCII editor can be used to enter the text (if you know the codes), and no unwanted conversion happens when copying and pasting. This coding is similar to the copyright symbol, which can be coded as '&copy;' or as '&#169;'.

Take the title tag as an example: its first 16 bytes are: &#28023;&#12398;

On a PC that can display Japanese characters, the browser converts these codes automatically when it renders the page.

The first character in the title means 'sea' and has the code 28023. The second character is the hiragana 'no', whose Unicode number is 12398.
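For anyone who wants to check this without a browser, the two codes can be decoded with Python's standard `html` module. This is only a sketch of the idea; the variable names are invented for the example.

```python
import html

# The first 16 bytes of the title, as quoted above.
title_start = "&#28023;&#12398;"

# html.unescape turns numeric character references into real characters.
decoded = html.unescape(title_start)
code_points = [ord(ch) for ch in decoded]

print(decoded)           # the two Japanese characters
print(code_points)       # the decimal Unicode numbers
print(len(title_start))  # length of the ASCII form
```

Going the other way, `"".join(f"&#{ord(ch)};" for ch in decoded)` rebuilds the ASCII form.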

I hope my explanation is good enough. But if you need more information, please let me know by a message in this thread or by stickymail.

bill

12:51 am on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



It is obvious that takagi sleeps a lot less than I do ;)

He is right that the page is encoded in UTF-8 Unicode. It can be really difficult to look up all the codes, so I suggest you cut and paste the source into the tool at the bottom half of this page [kanzaki.com]. That page is in Japanese, but it has some useful tools for when you get unreadable JIS and Unicode text in your e-mail or source viewer. It will convert the gibberish into Japanese for you.
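If you would rather do the conversion locally, something like the following Python sketch should give a similar result. It is not the code behind the tool above, just a rough equivalent; the function name `show_numeric_references` and the sample string are invented for illustration. It replaces only decimal references above the ASCII range, so markup such as `&amp;` in the source stays untouched.

```python
import re

# Matches decimal numeric character references such as &#28023;
_NUMERIC_REF = re.compile(r"&#(\d+);")


def show_numeric_references(source: str) -> str:
    """Replace non-ASCII decimal references with the characters they stand for."""

    def replace(match: re.Match) -> str:
        code_point = int(match.group(1))
        # Leave ASCII, out-of-range, and surrogate values as they are.
        if code_point < 128 or code_point > 0x10FFFF:
            return match.group(0)
        if 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)

    return _NUMERIC_REF.sub(replace, source)


sample = "<title>&#28023;&#12398; &amp; &#60;</title>"
converted = show_numeric_references(sample)
print(converted)
```

To use it on a saved page, read the file as text, pass the contents through the function, and write the result out as UTF-8.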

Hope this helps.

wolfy

8:58 am on Feb 21, 2003 (gmt 0)

10+ Year Member



Takagi and Bill:

thanks a lot!

I've used the tool on that page and it really works fine. Now I can see what I wanted to see!

wolfy

bill

9:03 am on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member Top Contributors Of The Month



...so who gets the prize? ;)

wolfy

9:10 am on Feb 21, 2003 (gmt 0)

10+ Year Member



And the winner is...

Bill and Takagi, tied for first place. You will share a half pint of beer at the next PubConf in London :)

thanks guys!

takagi

9:28 am on Feb 21, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I might not be there, next PubConf in London. Anyway, it is a first prize. And I'm only 'New User' to this forum. So looks like a nice start.

wolfy

11:46 am on Feb 21, 2003 (gmt 0)

10+ Year Member



<I might not be there, next PubConf in London.>

You really should come.

< Anyway, it is a first prize. And I'm only 'New User' to this forum. So looks like a nice start.>

great start really!