- ISO/IEC 8859-2
ISO 8859-2, more formally cited as ISO/IEC 8859-2 or less formally as Latin-2, is part 2 of ISO/IEC 8859, a standard character encoding defined by ISO. It encodes what it refers to as Latin alphabet no. 2, consisting of 191 characters from the Latin script, each encoded as a single 8-bit code value.

ISO_8859-2:1987, more commonly known by its preferred MIME name of ISO-8859-2 (note the extra hyphen), is the IANA charset name for this standard when it is used together with the control codes from ISO/IEC 6429 for the C0 (0x00-0x1F) and C1 (0x80-0x9F) parts. Escape sequences (from ISO/IEC 6429 or ISO/IEC 2022) are not to be interpreted. This character set also has the aliases ISO_8859-2, latin2, l2 and csISOLatin2.

This encoding shares many assignments with windows-1250 but is not a strict subset of it (unlike the case with windows-1252 and ISO 8859-1).

These code values can be used in almost any data interchange system to communicate in the following European languages: Bosnian, Croatian, Czech, Hungarian, Polish, Romanian (but see the next paragraph), Serbian (in Latin transcription), Serbo-Croatian, Slovak, Slovenian, Upper Sorbian and Lower Sorbian. Furthermore, it is suitable for representing some western European languages such as Finnish (with the exception of å, used in Swedish and Finnish) or German. When used alone, these latter languages are nominally encoded in ISO 8859-1, but the needed code points are shared with ISO 8859-2, which is an important aspect for multilingual documents.

It may be argued that ISO 8859-2 is not really suitable for Romanian because it lacks the letters s and t with comma below and contains s and t with cedilla instead. These letters were unified in the first versions of the Unicode standard, meaning that the appearance with cedilla or with comma was treated as a glyph choice rather than as separate characters; fonts intended for use with Romanian should, therefore, have characters with comma below at those code points.

Code page layout
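The layout can also be inspected programmatically. The following is a minimal sketch, assuming Python 3 and its bundled `iso8859_2` and `cp1250` codecs; the function names `high_half`, `differing_bytes` and `print_layout` are illustrative and not part of any standard. It maps each code value in the range A0-FF to a Unicode code point and lists the code values that windows-1250 assigns differently, which is one way to check the remark that ISO 8859-2 is not a strict subset of windows-1250.

```python
import unicodedata


def high_half(encoding="iso8859_2"):
    """Map each code value 0xA0-0xFF to the character the codec assigns to it."""
    return {b: bytes([b]).decode(encoding) for b in range(0xA0, 0x100)}


def differing_bytes(first="iso8859_2", second="cp1250"):
    """Return the code values in 0xA0-0xFF that the two codecs decode differently."""
    a, b = high_half(first), high_half(second)
    return [byte for byte in a if a[byte] != b[byte]]


def print_layout(encoding="iso8859_2"):
    """Print one line per code value: hex value, Unicode code point, character name."""
    for byte, char in high_half(encoding).items():
        print(f"{byte:02X}  U+{ord(char):04X}  {unicodedata.name(char)}")
```

Calling `print_layout()` lists the 96 graphic characters of the upper half; together with the 95 printable characters in 20-7E this matches the 191 characters mentioned in the introduction. Note that Python's codec maps AA, BA, DE and FE to the cedilla forms of s and t, in line with the Romanian caveat described earlier.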
In the following table characters are shown together with their corresponding Unicode code points. Note that code values 00-1F, 7F, and 80-9F are not assigned to characters by ISO/IEC 8859-2. Code 20 is the regular SPACE character, and A0 is the NO-BREAK SPACE. Code AD is a SOFT HYPHEN, which compliant web browsers should display only where a line is actually broken at that position.

External links
* [http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=28246&ICS1=35&ICS2=40&ICS3= ISO 8859-2:1999]
* [http://www.ecma-international.org/publications/standards/Ecma-094.htm Standard ECMA-94] : 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 "2nd edition (June 1986)"
* [http://www.itscj.ipsj.or.jp/ISO-IR/101.pdf ISO-IR 101] Right-Hand Part of Latin Alphabet No.2 "(February 1, 1986)"
* [http://nl.ijs.si/gnusl/cee/iso8859-2.html ISO 8859-2 (Latin 2) Resources]
Wikimedia Foundation. 2010.