Extended characters in keywords
What's the best strategy?
I have a client with an English language site, but because their field is international, the site contains many good search terms that include extended characters, mostly in proper nouns and names.
How do search engines deal with characters like é and Ö and ü? Do they automatically substitute the "undecorated" glyphs e and O and u? Am I better off using HTML equivalents, such as &uuml;? Or is it better to forget about the extended characters altogether?
I wrestle with this off and on and haven't come to any firm conclusions. So the website currently has a hodgepodge of all three -- undecorated glyphs, extended characters and HTML equivalents. I feel like I'm missing an opportunity by not having a handle on this.
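For what it's worth, the extended character and its HTML equivalents are the same character once the markup is decoded. Here is a minimal Python sketch using the standard library's html.unescape; the sample strings are only illustrations, and this says nothing about what any particular search engine's spider does with them.

```python
import html

# Three ways of writing the same word in HTML source, plus the literal character.
# "\u00fc" is the single code point for the u-umlaut.
forms = [
    "M&uuml;nchen",    # named entity
    "M&#252;nchen",    # decimal character reference
    "M&#xFC;nchen",    # hexadecimal character reference
    "M\u00fcnchen",    # literal extended character
]

# html.unescape resolves named and numeric references to the actual characters.
decoded = [html.unescape(s) for s in forms]

for source, text in zip(forms, decoded):
    print(f"{source!r} -> {text!r}")

# All four forms collapse to one identical string after decoding.
print(len(set(decoded)) == 1)
```

So the entity and the literal character differ only in the page source; the undecorated "Munchen" is the one that is genuinely a different string.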
Tedster, I am assuming you have tried searching using both extended and normal characters? Did you see any significant differences in the results?
On some engines, the differences between the two are big, even though both characters show up in the SERP.
For instance, on Google compare:
I just noticed something by posting that -- Google encodes the "ü" as "%FC". That's yet another approach! So what does Googlebot do with &uuml; when it finds that on my pages?
Perhaps our European members have wrestled with this more in depth than I have. I just get boggled whenever I try to wade into this area.
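On the "%FC" point: that is what URL percent-encoding produces when "ü" is treated as a single Latin-1 (ISO-8859-1) byte. A small Python sketch with urllib.parse shows it, alongside the UTF-8 form for comparison. Which encoding a given engine puts in its URLs is an assumption to check against the actual address bar, not something this code can tell you.

```python
from urllib.parse import quote, unquote

u_umlaut = "\u00fc"  # the character "ü"

# Percent-encode the same character under two different byte encodings.
latin1_form = quote(u_umlaut, encoding="latin-1")  # one byte: 0xFC
utf8_form = quote(u_umlaut, encoding="utf-8")      # two bytes: 0xC3 0xBC

print(latin1_form)  # %FC
print(utf8_form)    # %C3%BC

# Decoding only round-trips if the same encoding is assumed on the way back.
print(unquote("M%FCnchen", encoding="latin-1"))
```

The practical upshot is that "%FC" is not a third spelling, just the same character wrapped for transport in a URL.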
Interesting, you have just opened my eyes a little as well. The spidering system I use "corrects" these extended characters and, although I am getting some referrals via European engines, if I have the correct characters it might broaden the playing field a little.
The two examples you offered certainly have different results, but did you notice that Google offered to correct Munchen, but not München?
I too will be interested in the opinions of our European friends.
Isn't this a variation on the theme of misspellings? Some of my keywords are commonly misspelled, and I want to get results from searches on either spelling. I use a CSS display:none in a span to put all variations on the page with only one visible.
fubar<span class="altspell">(foobar, fübar)</span>
>>I too will be interested in the opinions of our European friends.
Well, I cannot speak for them. But we use extended characters here for the French language. If you query some keyword popularity tool for our local market, you can see that twice the people are not using extended characters while querying some SE. For instance, try " maison a vendre " and " maison à vendre ". To cover the whole market you must write it both ways on the same site.
I like tilt's approach a lot! I am going to give it a try and rewrite some phrases here and there. Thanks, tilt!
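If you want to list both spellings of each phrase without typing them by hand, here is a rough Python sketch that derives the undecorated form from the accented one with the standard unicodedata module. The function names strip_accents and spelling_variants are made up for this example, and the approach only drops the accent marks: it will turn "München" into "Munchen", not the German transliteration "Muenchen".

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed (e.g. 'à' -> 'a')."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def spelling_variants(phrase):
    """Return the phrase plus its unaccented form, without duplicates."""
    plain = strip_accents(phrase)
    return [phrase] if plain == phrase else [phrase, plain]


for phrase in ["maison \u00e0 vendre", "M\u00fcnchen", "house for sale"]:
    print(spelling_variants(phrase))
```

How you then work both variants into the visible copy is a separate question; this only produces the list.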
Tilt, that is a clever approach. Are you getting hits with this method?
I'm a bit hesitant to try it on domains that aren't relatively disposable -- it is, after all, a new twist on invisible text. As long as the SEs aren't doing automatic spidering and crunching of .css files, it's safe, I'd suppose.
Certainly you have no dark, spammy intentions. But variations on that technique ARE being used for spamming and it feels to me that its days may be numbered.
This is one of those areas where I'd like to see search engines kick up the linguistics programming a bit.
Macguru, thanks for the French language input - it's what I was kind of guessing: "...twice the people are not using extended characters while querying..." But I'm also guessing the percentage changes significantly for each language.
It's too early to tell if I'm getting hits. It looks like Google is still returning results from cached pages of my site. I just made the changes about 3 weeks ago. (Nobody other than Google indexes me yet.)
Yes, I'm a bit worried that it might be considered spammy. I'll let you know if I get the boot.