How do search engines deal with characters like é and Ö and ü? Do they automatically substitute the "undecorated" glyphs e and O and u? Am I better off using HTML equivalents, such as &uuml;? Or is it better to forget about the extended characters altogether?
I wrestle with this off and on and haven't come to any firm conclusions. So the website currently has a hodgepodge of all three -- undecorated glyphs, extended characters and HTML equivalents. I feel like I'm missing an opportunity by not having a handle on this.
For instance, on Google compare:
München [google.com]
to
Munchen [google.com].
I just noticed something by posting that -- Google encodes the "ü" as "%FC" in the URL. That's yet another approach! So what does Googlebot do with &uuml; when it finds that on my pages?
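For anyone who wants to poke at the different spellings, here is a minimal Python sketch. It only shows that the HTML entity, the raw character and "%FC" all name the same letter; it says nothing about what Googlebot actually does with them. The assumption that "%FC" comes from a Latin-1 (ISO-8859-1) encoded query is mine, based on 0xFC being the Latin-1 byte for ü.

```python
import html
from urllib.parse import quote, unquote

# "&uuml;" is the HTML named entity for ü; decoding it yields the raw character.
decoded = html.unescape("M&uuml;nchen")

# Percent-encoding depends on which byte encoding is chosen for the URL.
# Assumption: the "%FC" seen in the search URL is the Latin-1 byte for ü.
latin1_encoded = quote("ü", encoding="latin-1")
utf8_encoded = quote("ü", encoding="utf-8")

# Decoding the percent-escape with the same encoding gives the character back.
round_trip = unquote("M%FCnchen", encoding="latin-1")

print(decoded)         # München
print(latin1_encoded)  # %FC
print(utf8_encoded)    # %C3%BC
print(round_trip)      # München
```

So the entity and the percent-escape are two transport forms of the one character, while plain "u" is a different character altogether.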
Perhaps our European members have wrestled with this more in depth than I have. I just get boggled whenever I try to wade into this area.
The two examples you offered certainly have different results, but did you notice that Google offered to correct Munchen, but not München?
I too will be interested in the opinions of our European friends.
Onya
Woz
Well, I cannot speak for them, but we use extended characters here for the French language. If you query a keyword popularity tool for our local market, you can see that twice the people are not using extended characters while querying some SE. For instance, try "maison a vendre" and "maison à vendre". To cover the whole market you must write it both ways on the same site.
I like tilt's approach a lot! I am going to try it and rewrite some phrases here and there. Thanks tilt!
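If you want to generate the "both ways" list rather than type it by hand, something like this rough Python sketch could do it. The function names `strip_accents` and `both_spellings` are made up for illustration, and the approach (Unicode decomposition, then dropping the combining marks) is just one way to get the undecorated form.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Return text with accents removed, e.g. 'à' becomes 'a'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def both_spellings(phrase: str) -> list[str]:
    """Return the phrase as written, plus its unaccented form if different."""
    plain = strip_accents(phrase)
    return [phrase] if plain == phrase else [phrase, plain]

print(both_spellings("maison à vendre"))
print(both_spellings("München"))
```

Note that this gives "Munchen" for München, not the "Muenchen" transliteration some German writers use, so language-specific spellings would still have to be added by hand.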
I'm a bit hesitant to try it on domains that aren't relatively disposable -- it is, after all, a new twist on invisible text. As long as the SEs aren't doing automatic spidering and crunching of .css files, it's safe, I'd suppose.
Certainly you have no dark, spammy intentions. But variations on that technique ARE being used for spamming and it feels to me that its days may be numbered.
This is one of those areas where I'd like to see search engines kick up the linguistics programming a bit.
Macguru, thanks for the French language input - it's what I was kind of guessing: "...twice the people are not using extended characters while querying..." But I'm also guessing the percentage changes significantly for each language.
Yes, I'm a bit worried that it might be considered spammy. I'll let you know if I get the boot.