| 4:58 pm on Jul 26, 2002 (gmt 0)|
Personally I use the "français" form for a few reasons: search engines can't mess up searching for it, it takes fewer characters in the HTML (every little bit helps :), and as long as the Content-Type is set correctly and a Unicode encoding is used, it seems to work in more browsers than the long form - especially when you get into less common letters.
P.S. Welcome to WMW
| 5:06 pm on Jul 26, 2002 (gmt 0)|
As a good practice, always use fran&ccedil;ais.
It can become a size issue with large documents, I'll give you that, but it's safer for display purposes that way.
I've never had any display problems using the HTML entities in Netscape 4.x, IE 5 and above, Opera, or other browsers. It's a recommendation of the Government of Canada, and they test in text-to-speech browsers, braille, etc., so I'm pretty positive that it's the safest way to go. Anybody disagree?
| 5:13 pm on Jul 26, 2002 (gmt 0)|
From my experience a lot of the official codes just aren't recognized by browsers (especially 4.x). I don't think it will be an issue for French. But some of the less common ones just aren't supported. (I'll try to get a group of examples together - but it won't be until after lunch!)
| 5:13 pm on Jul 26, 2002 (gmt 0)|
Could you elaborate a little bit on the search engine problem you mentioned, dcheney? You're getting me worried here :( although I never noticed any problem with my French sites.
<added>I'm too slow!! Thanks dcheney</added>
| 5:23 pm on Jul 26, 2002 (gmt 0)|
What about the Unicode? Like &#231; for ç? Does it work better than &ccedil;? And maybe the spiders could read it (wishful thinking I guess)?
| 5:54 pm on Jul 26, 2002 (gmt 0)|
ok, all of the following are valid character entity references according to HTML 4.01 - see how many your browser can see (you should not see anything that looks like &name; ). (This is a subset of the whole group, I didn't bother with various accents on each vowel, uppercase variants, and greek/math stuff.)
*** hmm, looks like WMW won't let that form be interpreted :(
| 6:02 pm on Jul 26, 2002 (gmt 0)|
I see your point. Thanks for the info. Everything works fine in IE 5.0, but these entities don't work in Netscape 4.78 (Windows 2000):
œ š ˆ ‘ ’ “ ” „ † ‡ ‰
| 6:20 pm on Jul 26, 2002 (gmt 0)|
Yes, all seems fine in IE, and those that Netscape has trouble with aren't any that I'll need to worry about for my site for the most part. Thanks for all the input and the welcome! :) Oh, and can anyone point me to some good threads on SEO for someone who knows absolutely nothing about the subject and needs to start from scratch??
| 6:52 pm on Jul 26, 2002 (gmt 0)|
The most important thing is to make sure that your pages have a character set declared. The French accents are part of ISO-8859-1 (Latin-1), so that's an obvious choice. The following line in the head section of an HTML document does the trick:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
Once you have this (or an equivalent HTTP header sent by the server), you can use your accents just by writing them normally. However, you should make sure that your editing software doesn't save those characters in some Windows-specific character set; they really need to be in Latin-1.
If your document is encoded with a different character set that doesn't include the cedilla, then you need to use the &entity; spelling.
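For anyone who generates pages from a script, here is a minimal Python sketch of that fallback. The helper name `encode_for_charset` is made up for illustration; the `xmlcharrefreplace` error handler is part of Python's standard codecs and swaps any character the target charset lacks for a numeric reference (`&#NNN;`) rather than a named entity.

```python
def encode_for_charset(text, charset):
    """Encode text, replacing characters the charset lacks with &#NNN; references."""
    return text.encode(charset, errors="xmlcharrefreplace")


# ISO-8859-1 includes the cedilla, so it is stored as the single byte 0xE7.
latin1 = encode_for_charset("français", "iso-8859-1")

# Plain ASCII does not, so the cedilla falls back to a numeric reference.
ascii_only = encode_for_charset("français", "ascii")

print(latin1)
print(ascii_only)
```

The output of either call still has to be served with a matching charset declaration; the fallback only covers characters the chosen charset cannot represent.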
| 8:53 pm on Jul 26, 2002 (gmt 0)|
Bird, thanks for the great tip. My editing software sets that charset by default (I believe it's the same charset for English and French?), so it is good to know that it will cover me.
| 9:43 pm on Jul 26, 2002 (gmt 0)|
ISO-8859-1 covers the special characters of most Western European languages, though a few, such as the French œ ligature, are missing from it.
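A quick way to check what a given charset covers is to try encoding the characters in it. This is a small Python sketch; `missing_from` is a hypothetical helper name, and the sample characters are simply ones that came up in this thread.

```python
def missing_from(charset, chars):
    """Return the characters in `chars` that `charset` cannot encode."""
    missing = []
    for ch in chars:
        try:
            ch.encode(charset)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


french = "àâçéèêëîïôùûüÿœ"

# Which French letters are absent from Latin-1 and from Windows-1252?
print(missing_from("iso-8859-1", french))
print(missing_from("cp1252", french))

# The characters reported above as failing in Netscape 4.78.
print(missing_from("iso-8859-1", "š‘’“”„†‡‰"))
```

Characters that show up as missing from ISO-8859-1 but not from `cp1252` are the ones a Windows editor can save without complaint even though Latin-1 has no place for them.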
| 1:55 am on Jul 27, 2002 (gmt 0)|
"The most important thing is to make sure that your pages have a character set declared"
That's the key.
If you use the named entity (ampersand, name, semicolon), you can be sure that the user agent will get it right, provided it understands the HTML version in which that named entity first appeared. This will work with or without a charset declaration.
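The reason the entity form doesn't depend on the charset is that the reference itself is written entirely in ASCII. A rough Python sketch using the standard `html` module, which only mimics what a user agent does when it resolves a reference:

```python
import html

named = "fran&ccedil;ais"
numeric = "fran&#231;ais"

# The reference is pure ASCII, so its bytes are the same in all three charsets.
same_bytes = (
    named.encode("ascii")
    == named.encode("iso-8859-1")
    == named.encode("utf-8")
)
print(same_bytes)

# The named and numeric spellings resolve to the same text.
print(html.unescape(named))
print(html.unescape(named) == html.unescape(numeric))
```

Note that `html.unescape` knows the full modern entity list, so it says nothing about which names an older browser recognizes.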
If you use plain text with accents but don't have a charset declaration, it may or may not render correctly depending on whether or not the user is using the same character set as the one your file is in.
If you declare the charset, the file really is saved in that charset, and the browser is capable of reading the declaration, you'll always get it right. Don't mess up though: if one program saves as UTF-8 and another (probably older) program saves as ISO-8859-1, the accented characters will come out wrong in whichever files don't match the declaration. Standard ASCII characters are the exception, since pretty much every charset you might use renders them the same way.
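Here is what that kind of mismatch looks like, as a small Python sketch. The two variables stand in for the same text saved by two different programs; nothing here is specific to any particular editor or browser.

```python
text = "français"

saved_as_utf8 = text.encode("utf-8")         # the cedilla takes two bytes: C3 A7
saved_as_latin1 = text.encode("iso-8859-1")  # the cedilla takes one byte: E7

# UTF-8 bytes read as ISO-8859-1: each of the two bytes becomes its own character.
utf8_read_as_latin1 = saved_as_utf8.decode("iso-8859-1")
print(utf8_read_as_latin1)

# ISO-8859-1 bytes read as UTF-8: the lone E7 byte is not valid UTF-8,
# so it is shown here as the replacement character.
latin1_read_as_utf8 = saved_as_latin1.decode("utf-8", errors="replace")
print(latin1_read_as_utf8)

# Plain ASCII text is byte-for-byte identical in both charsets.
print("francais".encode("utf-8") == "francais".encode("iso-8859-1"))
```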
There's a useful page at Blooberry that gives you pretty much the entire character set in both charsets, along with some useful notes. It may not be the definitive reference, but it's the only one I could actually understand.