So I would like to know which kind the search engines prefer.
Moreover, the file names in my URLs contain a lot of small squares, e.g. www.site.com/page[][][][][][].php
That is because there are Chinese words in the names of my files.
Do you know whether that could be a problem, or is it useful because the search engines are able to read these small squares in the file name?
Lastly, I have this meta tag in the head of my pages:
<META http-equiv="Content-Type" content="text/html; charset=utf-8">
Do you think it would be better to replace it with this?
<META http-equiv="Content-Type" content="text/html; charset=big5">
Thanks
AndyouKnow
You have a lot of questions here. ;) I can try to address a few.
This forum does not handle Asian languages very well. We will have to make do by describing things in English.
On my web site I noticed that the HTML code contains two kinds of Chinese characters:
Some of them are numeric characters: 会版权
The others are symbol characters.
So I would like to know which kind the search engines prefer.
Moreover, the file names in my URLs contain a lot of small squares, e.g. www.site.com/page[][][][][][].php
That is because there are Chinese words in the names of my files.
Do you know whether that could be a problem, or is it useful because the search engines are able to read these small squares in the file name?
Although it is possible to use Chinese in file names, your server may not be handling the character set correctly, or it could just be a browser display issue. Either way, you would be a lot safer using ASCII file names.
Many English-language SEOs promote the use of keywords in folder and file names. However, there can be issues with Asian languages, especially if it is not done properly. You could be giving up ranking opportunities if some search engines cannot access your pages because of naming issues.
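To illustrate why non-ASCII file names are fragile, here is a minimal Python sketch. It is not from anyone's actual site: the file name is hypothetical, and it uses the traditional characters 版權 so that both encodings can represent it. The same name percent-encodes to different bytes depending on whether UTF-8 or Big5 is assumed, so a server, browser, or crawler that assumes the wrong charset ends up with a different URL.

```python
from urllib.parse import quote, unquote


def percent_encode(name: str, charset: str) -> str:
    """Percent-encode a file name using the given byte encoding."""
    return quote(name, encoding=charset)


# Hypothetical file name; 版權 means "copyright" in traditional characters.
file_name = "page版權.php"

utf8_url = percent_encode(file_name, "utf-8")
big5_url = percent_encode(file_name, "big5")

print(utf8_url)  # page%E7%89%88%E6%AC%8A.php
print(big5_url)  # a different byte sequence for the same name

# Decoding only round-trips when both sides assume the same charset.
print(unquote(utf8_url, encoding="utf-8") == file_name)
print(unquote(big5_url, encoding="big5") == file_name)
print(unquote(big5_url, encoding="utf-8", errors="replace") == file_name)
```

An all-ASCII name such as page-copyright.php comes out identical under both encodings, which is the practical argument for ASCII file names.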
I use Unicode numeric character references (of the form &#NNNNN; as well as &#xHHHH;) for Chinese (and Japanese) characters, but keep all URLs strictly 7-bit-safe ASCII, without even any %xx escapes (e.g. no spaces).
Thus my content type is in the first form that you suggest, i.e. like:
content="text/html; charset=utf-8"
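A rough Python sketch of that approach, for anyone who wants to try it. The helper names to_numeric_refs and is_plain_ascii_url are made up for illustration and are not part of any library. It turns non-ASCII characters into numeric character references (decimal or hexadecimal) and checks that a URL is plain 7-bit ASCII with no spaces or %xx escapes.

```python
import html


def to_numeric_refs(text: str, hexadecimal: bool = False) -> str:
    """Replace each non-ASCII character with a numeric character reference."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)             # ASCII passes through unchanged
        elif hexadecimal:
            parts.append(f"&#x{code:X};")  # hexadecimal form
        else:
            parts.append(f"&#{code};")     # decimal form
    return "".join(parts)


def is_plain_ascii_url(url: str) -> bool:
    """True if the URL is 7-bit ASCII with no spaces and no %xx escapes."""
    return url.isascii() and " " not in url and "%" not in url


sample = "会版权"
print(to_numeric_refs(sample))                    # &#20250;&#29256;&#26435;
print(to_numeric_refs(sample, hexadecimal=True))  # &#x4F1A;&#x7248;&#x6743;

# A browser turns the references back into the original characters.
print(html.unescape(to_numeric_refs(sample)) == sample)

print(is_plain_ascii_url("www.site.com/page-copyright.php"))
print(is_plain_ascii_url("www.site.com/page%E7%89%88.php"))
```

For the decimal form only, the built-in text.encode("ascii", "xmlcharrefreplace") gives the same result. Since the references are themselves pure ASCII, the page displays the same whichever charset is declared.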
Rgds
Damon