Options +FollowSymlinks
RewriteEngine on
RewriteRule ^test/(.*)\.html$ test.php?id=$1 [QSA]
Then type this in the browser:
[myserver.com...]
This is test.php:
<html>
<body>
<?php
print_r( $_GET );
?>
</body>
</html>
And this is the output:
Array ( [id] => ändreä )
Why is ändreä being turned into ändreä?
Thanks!
[edited by: jdMorgan at 4:10 am (utc) on Mar. 15, 2004]
[edit reason] Removed e-mail personal URL per TOS [/edit]
Welcome to WebmasterWorld [webmasterworld.com]!
The basic answer to this question is that accented/umlauted characters are not directly supported in URLs. These "special" characters are converted to UTF-8 byte sequences for handling by Apache and browsers. According to the HTTP specifications, these characters should not be used in URLs, because they cause exactly this kind of problem when the browser and the composition utility (e.g. an HTML editor) do not have the same default character set.
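As a rough illustration of that conversion (a sketch only, assuming the browser sends the path as UTF-8 and the page is displayed as ISO-8859-1, neither of which is confirmed in this thread), the following PHP shows how each "ä" becomes two bytes and how those bytes look when read with the wrong character set. It assumes the mbstring extension is available.

```php
<?php
// "ändreä" written as explicit UTF-8 bytes, so the result does not
// depend on how this file itself is saved.
$name = "\xC3\xA4ndre\xC3\xA4";

// What a UTF-8 browser would put in the URL: each ä becomes %C3%A4.
$encoded = rawurlencode($name);
echo $encoded, "\n"; // %C3%A4ndre%C3%A4

// The server decodes the percent-escapes, so the script sees raw UTF-8 bytes.
$received = rawurldecode($encoded);

// Assumption: the page is rendered as ISO-8859-1. Each byte is then shown
// as its own character. Simulate that by reinterpreting the bytes as
// Latin-1 and converting them to UTF-8 for display.
$misread = mb_convert_encoding($received, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n"; // Ã¤ndreÃ¤
```

If that assumption holds, the two-characters-per-umlaut output is a display mismatch rather than data corruption: the bytes in `$_GET` are intact, but they are interpreted in a different character set from the one they were sent in.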
Jim
Unfortunately, there are a LOT of sites that use these characters in their URLs.
I was able to write a filter in PHP that accounts for these letters and makes the URLs work. If anyone finds this thread by searching on the topic in the future, that is what you will have to do.
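The filter itself was not posted, so the following is only a guess at what such a filter might look like, not the actual code. It assumes the site's pages and database use ISO-8859-1, that the mbstring extension is available, and that the function name `normalize_id` is hypothetical.

```php
<?php
// Hypothetical helper: bring a rewritten URL parameter into the site's
// character set. This is a sketch, not the filter described in the post.
function normalize_id(string $raw, string $siteCharset = 'ISO-8859-1'): string
{
    // If the bytes form valid UTF-8, assume the browser sent UTF-8
    // and convert to the site's character set.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return mb_convert_encoding($raw, $siteCharset, 'UTF-8');
    }

    // Otherwise assume the value is already in the site's character set.
    return $raw;
}
```

In test.php this could be applied as `normalize_id($_GET['id'])` before the value is printed or used in a lookup. Plain ASCII values pass through unchanged, since ASCII is valid UTF-8 and maps to the same bytes in ISO-8859-1.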
In my case I had a shopping cart with tens of thousands of products, many of which had these characters. They needed to be indexed by search engines, so deleting them was not an option because real people who use a language other than English actually type these characters into search engines.
In any case, the German-language shopping cart I worked on uses URLs with these characters just fine now.
Thanks,
John M
Houston, Texas, USA ...