mod_rewrite and non-english letters / alphabets - HELP!


jnmunsey

3:57 pm on Mar 12, 2004 (gmt 0)

10+ Year Member



OK, I tried this in my .htaccess file on my Red Hat Linux server with Ensim:

Options +FollowSymlinks
RewriteEngine on
RewriteRule ^test/(.*)\.html$ test.php?id=$1 [QSA]

Then type this in the browser:

[myserver.com...]

This is test.php:

<html>
<body>
<?php
print_r( $_GET );
?>
</body>
</html>

And this is the output:

Array ( [id] => ändreä )

Why is ändreä being turned into ändreä?

Thanks!

[edited by: jdMorgan at 4:10 am (utc) on Mar. 15, 2004]
[edit reason] Removed e-mail personal URL per TOS [/edit]

jdMorgan

4:15 am on Mar 15, 2004 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



jnmunsey,

Welcome to WebmasterWorld [webmasterworld.com]!

The basic answer to this question is that accented/umlauted characters are not directly supported in URLs. These "special" characters are converted to UTF-8 byte sequences for handling by Apache and browsers. According to the HTTP specifications, these characters should not be used unencoded in URLs, because they cause exactly this kind of problem when the browser and the composition utility (e.g. an HTML editor) do not share the same default character set.
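As a rough illustration of that conversion, here is a small PHP sketch. It assumes the browser sent "ändreä" as percent-encoded UTF-8 (%C3%A4ndre%C3%A4), which the thread does not confirm, and that the mbstring extension is available. The function name inspect_id is made up for this example.

```php
<?php
// Hypothetical helper for illustration only; requires the mbstring extension.
// Shows what "ändreä" looks like at the byte level once the URL is decoded.
function inspect_id(string $encoded): array
{
    // Percent-decode to the raw bytes the script would receive.
    $bytes = rawurldecode($encoded);

    return [
        'hex'        => bin2hex($bytes),            // raw bytes as hex
        'bytes'      => strlen($bytes),             // byte count
        'chars'      => mb_strlen($bytes, 'UTF-8'), // character count if read as UTF-8
        // The same text re-encoded as ISO-8859-1 (one byte per character).
        'latin1_hex' => bin2hex(mb_convert_encoding($bytes, 'ISO-8859-1', 'UTF-8')),
    ];
}

// Assumed input: "ändreä" as percent-encoded UTF-8.
print_r(inspect_id('%C3%A4ndre%C3%A4'));
```

Each ä arrives as two bytes (c3 a4). If the page that prints the value is served as ISO-8859-1, every one of those bytes is drawn as its own character, which is one plausible reason an umlaut would come out garbled.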

Jim

jnmunsey

5:18 am on Mar 15, 2004 (gmt 0)

10+ Year Member



Ok, in other words the standards makers didn't give a hoot about anything but the English language huh?

Unfortunately there are a LOT of sites that use these characters in them.

I was able to write a filter in PHP to account for these letters and make it work, so if anyone searches on this topic in the future, this is what you will have to do.
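The filter itself is not shown here, so the following is only a minimal sketch of what such a filter could look like, not the actual code. It assumes the site's pages and database use ISO-8859-1, that the id may arrive either as UTF-8 or already as ISO-8859-1, and that mbstring is installed. The name normalize_id is made up for this example.

```php
<?php
// Hypothetical filter; NOT the code described in this post.
// Assumption: the application expects ISO-8859-1 strings.
function normalize_id(string $raw, string $target = 'ISO-8859-1'): string
{
    // $_GET values are already percent-decoded by PHP, so $raw holds raw bytes.
    // If those bytes form valid UTF-8, convert them to the target charset.
    if (mb_check_encoding($raw, 'UTF-8')) {
        return mb_convert_encoding($raw, $target, 'UTF-8');
    }

    // Otherwise assume the value is already in the target charset.
    return $raw;
}

// Usage inside test.php (the ?? '' guards against a missing parameter):
$id = normalize_id($_GET['id'] ?? '');
```

Plain ASCII passes through unchanged. The check is a heuristic: a few ISO-8859-1 byte pairs also happen to be valid UTF-8, so a site that knows which encoding its visitors' browsers send can skip the detection and convert unconditionally.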

In my case I had a shopping cart with tens of thousands of products, many of which had these characters. They needed to be indexed by search engines, so deleting them was not an option because real people who use a language other than English actually type these characters into search engines.

In any case, the German-language shopping cart I worked on uses URLs with these characters just fine now.

Thanks

John M
Houston, Texas, USA ...