Forum Moderators: coopster & phranque

Umlaut in Url

is it possible?

         

globay

4:26 pm on Mar 1, 2003 (gmt 0)

10+ Year Member



Is it possible to have German umlauts like ä, ö, ü in the URL? On my site, I use URL rewriting, and the folder after "products" is the name of the category, which is also used for the title, h1, etc. For example:

domain.com/products/DÖNER :

<title>Döner</title> ...

Using the umlaut would be the easiest way and probably the best way to rank high in Google.

--
thanks, globay

hakre

4:38 pm on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi globay,

even though you can use umlauts in unix filesystems without problems, the url of such a file has to be encoded.

nevertheless, i wouldn't do this, because not all browsers know how to handle it, and it will mess up your site. instead i would use oe, ae and ue, which is compatible with google, too. the title tag is not affected by this limitation; you can use &ouml; etc. there.

andreasfriedrich

4:48 pm on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Döner on WebmasterWorld. That's funny ;)

globay

6:37 pm on Mar 1, 2003 (gmt 0)

10+ Year Member



Döner on WebmasterWorld. That's funny ;)

Well, that was the first word that came to my mind! ;-)

--

And I have another problem with umlauts: when I insert text containing special characters such as ä, ü, ß, ... into a MySQL database and then try to display the content via PHP (echo $content;), I get strange symbols, such as ö! How can I prevent that?

bird

7:26 pm on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You need to percent-encode the character: a % sign followed by the hex value of its byte.

domain.com/products/D%d6NER
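For illustration, a minimal Python sketch of that percent-encoding (in PHP, rawurlencode() plays a similar role). It assumes the category name is encoded as ISO-8859-1, which is what yields %D6; the same letter encoded as UTF-8 becomes two escaped bytes instead. The variable names are made up for the example.

```python
from urllib.parse import quote

category = "DÖNER"

# ISO-8859-1: Ö is the single byte 0xD6
latin1_path = "/products/" + quote(category, encoding="iso-8859-1")

# UTF-8: Ö is the two bytes 0xC3 0x96
utf8_path = "/products/" + quote(category, encoding="utf-8")

print(latin1_path)  # /products/D%D6NER
print(utf8_path)    # /products/D%C3%96NER
```

Which of the two forms a server receives depends on the charset the link was written in and on the browser.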

Enjoy your meal! ;)

bird

7:31 pm on Mar 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



When I insert text containing special characters such as ä, ü, ß, ... into a MySQL database and then try to display the content via PHP (echo $content;), I get strange symbols, such as ö! How can I prevent that?

Make sure that your page declares the right character set (typically ISO-8859-1 for German-language pages), either in the headers that the server sends with it or through an http-equiv meta element in the HTML.
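A small Python sketch of how a charset mismatch produces such symbols. The thread does not say which encodings are actually involved, so this assumes one common case: the text is stored and sent as UTF-8 while the page is read as ISO-8859-1.

```python
text = "Döner"

# Bytes as a UTF-8 database connection or script would emit them
utf8_bytes = text.encode("utf-8")

# A browser that treats the page as ISO-8859-1 decodes every byte on its own
misread = utf8_bytes.decode("iso-8859-1")

# Decoding with the charset the bytes were written in restores the text
correct = utf8_bytes.decode("utf-8")

print(misread)  # DÃ¶ner
print(correct)  # Döner
```

The fix is the same in either direction: the declared charset has to match the bytes that are actually sent.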

globay

7:38 pm on Mar 1, 2003 (gmt 0)

10+ Year Member



How likely is it that the umlaut will cause trouble? I could change the "ö" to "oe", but then it would not be reversible, since it needs to be distinguishable from an actual "oe". I could use D%d6NER instead, but then I would not get the keyword into the URL.
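The reversibility problem can be sketched in a few lines of Python. The mapping, the function names, and the category list are all hypothetical; the lookup table at the end is just one possible way around the ambiguity, not something proposed in the thread.

```python
# Hypothetical transliteration table for German special characters
UMLAUT_MAP = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}

def to_ascii_slug(name):
    """Lowercase the name and replace umlauts with ASCII digraphs."""
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in name.lower())

def naive_reverse(slug):
    """Blindly turn digraphs back into umlauts (this is the lossy step)."""
    return slug.replace("ae", "ä").replace("oe", "ö").replace("ue", "ü")

print(to_ascii_slug("DÖNER"))                  # doener
print(naive_reverse(to_ascii_slug("Poesie")))  # pösie (wrong: a real "oe")

# Storing the mapping once avoids guessing on the way back
categories = ["Döner", "Poesie"]
slug_to_name = {to_ascii_slug(name): name for name in categories}
print(slug_to_name["doener"])  # Döner
```

With such a table the URL stays plain ASCII, and the title and h1 can still show the real category name.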

Maybe I can work around the problem. Well, anyway, thanks for your help.

--
globay

hakre

6:30 pm on Mar 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



hi globay,

for your output problem, use php's htmlentities() [php.net] function to convert the special chars into their corresponding entities (ä to &auml;, ß to &szlig;, etc.). this function is really useful for output into html files! ;)
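To show what that conversion does, here is a rough Python approximation built on the standard html.entities table. It is only a sketch of the idea, not a replacement for htmlentities(), whose quote handling and charset options differ; the function name is made up.

```python
from html.entities import codepoint2name

def to_html_entities(text):
    """Replace every character that has a named HTML entity."""
    parts = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        parts.append(f"&{name};" if name else ch)
    return "".join(parts)

print(to_html_entities("Döner & Weißbier"))
# D&ouml;ner &amp; Wei&szlig;bier
```

Because the output is pure ASCII, it displays correctly whatever charset the page declares.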

seindal

1:33 pm on Mar 4, 2003 (gmt 0)

10+ Year Member



I once experimented with this, using the Danish letters æ ø and å, but it cannot be used reliably.

You can encode the characters in the links, but the problem is what happens when the user clicks on a link. This is browser specific.

Some browsers, like old Netscape, just send the URL as written, with 8-bit characters in it. Others, notably MSIE, encode the URL as UTF-8 before sending it to the web server, so an 8-bit character arrives as two bytes, both of which have the 8th bit set. There is no indication whatsoever, in the request headers or elsewhere, that the URL is encoded in UTF-8. To further complicate matters, the UTF-8 encoding can be disabled in at least some versions of MSIE, so you cannot even rely on the user agent to decide how to decode an incoming URL at the server.

The result is that if you use 8-bit characters in URLs, you never know how they will come back to your server. They could be unchanged or they could be UTF-8 encoded. Obviously, all sorts of tricks could be applied at the server end, such as trying out both URLs, but since the encoding of non-ASCII URLs is not specified by the RFCs, other browsers might do completely different things.
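One such server-side trick, sketched in Python under the assumption that only UTF-8 and ISO-8859-1 ever arrive: try a strict UTF-8 decode first and fall back to ISO-8859-1. The function name is hypothetical, and the heuristic is not watertight, since some ISO-8859-1 byte sequences also happen to be valid UTF-8.

```python
from urllib.parse import unquote_to_bytes

def decode_request_path(raw_path):
    """Guess the encoding of a percent-encoded request path."""
    raw_bytes = unquote_to_bytes(raw_path)
    try:
        # Strict decoding fails on most ISO-8859-1 byte sequences
        return raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this never fails
        return raw_bytes.decode("iso-8859-1")

print(decode_request_path("/products/D%D6NER"))     # /products/DÖNER
print(decode_request_path("/products/D%C3%96NER"))  # /products/DÖNER
```

Both forms of the link resolve to the same category here, but a browser that uses a third encoding would still slip through.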

My advice: avoid non-ASCII characters in URLs.

René.

globay

5:14 pm on Mar 4, 2003 (gmt 0)

10+ Year Member



René,

thanks for your detailed answer. I will stick to your advice and try to avoid special characters.

--
globay