E.g. Scandinavian characters æ and å, Hungarian ö, Spanish ñ, etc.
I'm not talking about Unicode (Chinese characters etc., though I'd really like to know more about these).
I currently have a directory structure that uses place names. At the moment I'm using o instead of ö in these directory names, and using a preg replacement of "\W" with "-" to remove punctuation characters.
What I'd like to do is allow all foreign characters that are included in the Extended ASCII set in file names and directory names. (Extended ASCII is missing the long ö and ü, i.e. ő and ű, from the Hungarian character set ... sigh)
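For reference, the current approach looks roughly like the sketch below. The function name slugify_place() and the replacement map are made up for illustration, the input is assumed to be an ISO-8859-1 byte string, and runs of punctuation are collapsed into one hyphen, which may differ from the real code.

```php
<?php
// Hypothetical helper: slugify_place() and the map below are illustrative
// assumptions, not the code actually running on the site.
function slugify_place(string $name): string
{
    // Swap a few accented letters (given as ISO-8859-1 bytes) for ASCII.
    $map = [
        "\xF6" => 'o',  // ö
        "\xFC" => 'u',  // ü
        "\xE5" => 'a',  // å
        "\xF1" => 'n',  // ñ
        "\xE6" => 'ae', // æ
    ];
    $ascii = strtr($name, $map);

    // Replace each run of non-word characters with a single hyphen.
    $slug = preg_replace('/\W+/', '-', $ascii);

    // Drop hyphens left over at either end.
    return trim($slug, '-');
}

echo slugify_place("Malm\xF6, Sweden"), "\n";
```

Any accented letter missing from the map would be treated as punctuation by \W in the default locale and turned into a hyphen, which is the limitation I'm trying to get away from.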
So what problems have I not thought of? My initial tests suggest that Apache and the OS (Linux) will be fine with retrieving these.
Next, the preg functions: \w doesn't match these types of characters. Reading the egrep man page suggests that if I set the locale with setlocale(LC_CTYPE,''); I can make it match. The problem is that this covers only one language. From PHP 4.3 onwards I can pass in multiple locales, but this seems crazy... is there a simpler way?
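One possibly simpler route, sketched here as an assumption rather than a tested solution, is to skip \w and the locale altogether and spell out the letter bytes in the character class. This assumes the names are ISO-8859-1 byte strings, where the accented letters occupy 0xC0 to 0xFF apart from 0xD7 (×) and 0xF7 (÷). The function name slugify_latin1() is hypothetical.

```php
<?php
// Hypothetical helper; assumes $name is an ISO-8859-1 (Latin-1) byte string.
function slugify_latin1(string $name): string
{
    // Keep ASCII letters and digits plus the Latin-1 letters 0xC0-0xFF,
    // leaving out 0xD7 (multiplication sign) and 0xF7 (division sign).
    // Every other run of bytes becomes a single hyphen.
    $slug = preg_replace(
        '/[^A-Za-z0-9\xC0-\xD6\xD8-\xF6\xF8-\xFF]+/',
        '-',
        $name
    );

    // Drop hyphens left over at either end.
    return trim($slug, '-');
}

echo slugify_latin1("Malm\xF6, Sweden"), "\n";
```

Because the ranges are written as explicit byte values, the result should not depend on which locale is installed or active. Unlike \w, this class also treats the underscore as punctuation, and it would need rethinking if the names were ever stored as UTF-8.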
[php.net...]
As always any ideas or suggestions are much appreciated.
UTF-8, ISO-8859-1, PHP and XHTML [webmasterworld.com]
Saving foreign characters into the database [webmasterworld.com]
Matching international characters [webmasterworld.com]
Greek letters (Numeric Character References) in MySQL? [webmasterworld.com]
Removing umlauts from PHP [webmasterworld.com]