Computers don't see text as shapes; they see numeric codes. A character set is a table that translates between the codes and the shapes. The same code used to generate the letter "a" in one character set might be used to display a hiragana "shu" (or a Braille sequence, or a bullet, or an ancient Mycenaean Linear B footstool ideogram) in another. This is why, for instance, "smart quotes" from older Mac programs rendered as funny accent marks in Windows programs, and vice versa: the two platforms used different character sets.
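To make the smart-quote problem concrete, here is a minimal Python sketch. It assumes the text was written by a program using the classic Mac Roman character set and then read by a program assuming Windows-1252; the sample word "hi" and the variable names are just for illustration.

```python
# Four bytes as an older Mac program would have stored: “hi”
# (in Mac Roman, 0xD2 and 0xD3 are the curly double quotes).
raw = bytes([0xD2, 0x68, 0x69, 0xD3])

# Read with the character set the bytes were written in.
as_mac = raw.decode("mac_roman")   # “hi”

# Read the very same bytes with the Windows-1252 table instead.
as_win = raw.decode("cp1252")      # ÒhiÓ

print(as_mac)
print(as_win)
```

The bytes never change; only the table used to look them up does. The plain letters survive because both tables agree on the ASCII range.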
Unicode (http://www.unicode.org/), as its name implies, is a project to create a universal set containing all the letters, numbers, punctuation marks, and so on for all the major alphabets of the world, essentially combining dozens of existing character sets into one. UTF-8 is not a separate character set but a standard way of encoding Unicode characters as bytes, using one byte for plain ASCII characters and two to four bytes for everything else. According to the XML specification, all XML processors must support UTF-8 (and UTF-16) at a minimum.
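ISO-8859-1, also known as Latin-1, is a character set containing characters used by Western European languages (including English, French, Spanish, and German). It predates Unicode, and its 256 codes were adopted as the first 256 Unicode code points, so nowadays it can be considered a subset of Unicode. It is not byte-compatible with UTF-8, though: only the ASCII range (codes 0 to 127) is stored identically, while an accented letter such as "é" takes one byte in ISO-8859-1 and two bytes in UTF-8.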
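A short Python sketch of how the same character is stored under ISO-8859-1 and under UTF-8 may help here. The sample characters and variable names are arbitrary choices for illustration.

```python
plain = "a"
accented = "\u00e9"  # é

# One byte in ISO-8859-1, two bytes in UTF-8.
latin1_bytes = accented.encode("iso-8859-1")  # b'\xe9'
utf8_bytes = accented.encode("utf-8")         # b'\xc3\xa9'

# ASCII characters are stored identically in both.
ascii_same = plain.encode("iso-8859-1") == plain.encode("utf-8")

# The Unicode code point of é equals its ISO-8859-1 byte value (0xE9).
code_point = ord(accented)

print(latin1_bytes, utf8_bytes, ascii_same, hex(code_point))
```

So the two agree on which number each character has, but not on how numbers above 127 are written out as bytes.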
Windows-1252 is a proprietary character set for Western European languages created by Microsoft and widely used in Windows applications.
That said, you should label your data according to the form it actually takes. If your content is encoded as Windows-1252 (for instance, because it was generated in Microsoft Word), you can't just change the label, so to speak, and call it UTF-8; the characters will display improperly or not at all.
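Here is a rough Python sketch of what goes wrong with a mislabel and one way to fix it. It assumes the content really is Windows-1252 (the sample string with curly quotes is made up); the fix is to decode with the true encoding and then re-encode, rather than just changing the label.

```python
# Text with curly quotes, stored as Windows-1252 bytes (as Word might produce).
original = "\u201cquoted\u201d"
data = original.encode("cp1252")  # b'\x93quoted\x94'

# Mislabel: treat the Windows-1252 bytes as if they were UTF-8.
try:
    data.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False  # 0x93 is not a valid start of a UTF-8 sequence

# A lenient reader substitutes U+FFFD (the replacement character) instead.
lenient = data.decode("utf-8", errors="replace")

# Actual conversion: decode with the real encoding, then encode as UTF-8.
repaired = data.decode("cp1252").encode("utf-8")

print(strict_ok, lenient, repaired)
```

After the conversion step, labeling the result as UTF-8 is accurate, because the bytes themselves have changed to match the label.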