ISO/IEC 646
ISO/IEC 646:1991, Information technology — ISO 7-bit coded character set for information interchange, is an ISO standard that since its first edition in 1972 has specified a 7-bit character code from which several national standards are derived. ISO/IEC 646 was also ratified by ECMA as ECMA-6.
Because the portion of ISO/IEC 646 shared by all countries (the "invariant set") specifies only the letters of the basic modern Latin alphabet, countries whose languages use additional Latin letters needed to create national variants of ISO 646 to write their native scripts. The 8-bit byte was not yet universally accepted at the time, so the national characters had to fit within 7 bits; as a result, some characters that appear in ASCII do not appear in other national variants of ISO 646.
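As a rough illustration of the invariant set, the sketch below flags the characters of a string that are not guaranteed to survive unchanged across national variants. It assumes the twelve national-use positions are the ones left without a fixed glyph in the code page table of this article (0x23, 0x24, 0x40, 0x5B to 0x5E, 0x60 and 0x7B to 0x7E). The names NATIONAL_USE and variant_dependent are invented for this example and do not come from the standard.

```python
# Code positions assumed to be open to national substitution
# (the cells without a fixed glyph in the code page table).
NATIONAL_USE = frozenset(
    [0x23, 0x24, 0x40, 0x5B, 0x5C, 0x5D, 0x5E, 0x60, 0x7B, 0x7C, 0x7D, 0x7E]
)


def variant_dependent(text):
    """Return the characters of `text` that lie outside the invariant set.

    A character is reported if its code does not fit in 7 bits or if it
    occupies one of the national-use positions.
    """
    return [ch for ch in text if ord(ch) > 0x7F or ord(ch) in NATIONAL_USE]


if __name__ == "__main__":
    print(variant_dependent("Hello, world!"))   # nothing variant-dependent
    print(variant_dependent("a[i] = {x}"))      # brackets and braces are flagged
```

Under these assumptions, plain English prose passes untouched, while typical programming-language punctuation such as brackets and braces is flagged.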
History
ISO/IEC 646 and its predecessor ASCII (ANSI X3.4) largely endorsed existing practice regarding character encodings in the telecommunications industry.
As ASCII lacked a number of characters needed for languages other than English, national variants were created that replaced some less-used characters with the needed ones. Because these variants were mutually incompatible, an International Reference Version (IRV) of ISO/IEC 646 was introduced in an attempt to at least restrict the replaced set to the same characters in all variants. The original version (ISO 646 IRV) differed from ASCII only at code point 0024, where ASCII's dollar sign ($) was replaced by the international currency symbol (¤). The final 1991 version of the code is identical to ASCII.[1]
The ISO 8859 series of standards for 8-bit character encodings supersedes the ISO 646 international standard and its national variants: the additional bit provides 96 additional characters, so no ASCII codes need to be substituted. The ISO 10646 standard, directly related to Unicode, supersedes all of the ISO 646 and ISO 8859 sets with one unified character set that uses larger, 21-bit code values.
A legacy of ISO/IEC 646 is visible on Windows, where in some fonts or locales, the backslash character used in filenames is rendered as ¥ or other characters. Despite the fact that a different code for ¥ was available even on the original IBM PC, so much text was created with the backslash code used for ¥ that even modern Windows fonts have found it necessary to render the code that way. Another legacy is the existence of trigraphs in the C programming language.
Codepage layout
The following table shows the ISO/IEC 646 character set. Each character is shown with the hex code of its Unicode equivalent and the decimal value of the ISO/IEC 646 code. Grey shaded cells indicate code points with character glyphs that vary from region to region. These are discussed in detail below.
ISO/IEC 646 | −0 | −1 | −2 | −3 | −4 | −5 | −6 | −7 | −8 | −9 | −A | −B | −C | −D | −E | −F
0− | NUL 0000 0 | SOH 0001 1 | STX 0002 2 | ETX 0003 3 | EOT 0004 4 | ENQ 0005 5 | ACK 0006 6 | BEL 0007 7 | BS 0008 8 | HT 0009 9 | LF 000A 10 | VT 000B 11 | FF 000C 12 | CR 000D 13 | SO 000E 14 | SI 000F 15
1− | DLE 0010 16 | DC1 0011 17 | DC2 0012 18 | DC3 0013 19 | DC4 0014 20 | NAK 0015 21 | SYN 0016 22 | ETB 0017 23 | CAN 0018 24 | EM 0019 25 | SUB 001A 26 | ESC 001B 27 | FS 001C 28 | GS 001D 29 | RS 001E 30 | US 001F 31
2− | SP 0020 32 | ! 0021 33 | " 0022 34 | 35 | 36 | % 0025 37 | & 0026 38 | ' 0027 39 | ( 0028 40 | ) 0029 41 | * 002A 42 | + 002B 43 | , 002C 44 | - 002D 45 | . 002E 46 | / 002F 47
3− | 0 0030 48 | 1 0031 49 | 2 0032 50 | 3 0033 51 | 4 0034 52 | 5 0035 53 | 6 0036 54 | 7 0037 55 | 8 0038 56 | 9 0039 57 | : 003A 58 | ; 003B 59 | < 003C 60 | = 003D 61 | > 003E 62 | ? 003F 63
4− | 64 | A 0041 65 | B 0042 66 | C 0043 67 | D 0044 68 | E 0045 69 | F 0046 70 | G 0047 71 | H 0048 72 | I 0049 73 | J 004A 74 | K 004B 75 | L 004C 76 | M 004D 77 | N 004E 78 | O 004F 79
5− | P 0050 80 | Q 0051 81 | R 0052 82 | S 0053 83 | T 0054 84 | U 0055 85 | V 0056 86 | W 0057 87 | X 0058 88 | Y 0059 89 | Z 005A 90 | 91 | 92 | 93 | 94 | _ 005F 95
6− | 96 | a 0061 97 | b 0062 98 | c 0063 99 | d 0064 100 | e 0065 101 | f 0066 102 | g 0067 103 | h 0068 104 | i 0069 105 | j 006A 106 | k 006B 107 | l 006C 108 | m 006D 109 | n 006E 110 | o 006F 111
7− | p 0070 112 | q 0071 113 | r 0072 114 | s 0073 115 | t 0074 116 | u 0075 117 | v 0076 118 | w 0077 119 | x 0078 120 | y 0079 121 | z 007A 122 | 123 | 124 | 125 | 126 | DEL 007F 127
National variants
Some national variants of ISO 646 are:
Code | ISO-IR | Standard | Used in
CA-1 | 121 | CSA Z243.4-1985 | Canada (nr. 1 alternative, with “î”) (French, classical)
CA-2 | 122 | CSA Z243.4-1985 | Canada (nr. 2 alternative, with “É”) (French, reformed orthography)
CN | 057 | GB/T 1988-80 | People's Republic of China (Basic Latin)
CU | 151 | NC 99-10:81 | Cuba (Spanish)
DE | 021 | DIN 66003 | Germany (German)
DK | - | DS 2089 | Denmark (Danish)
FI | 010 | SFS 4017 | Finland (basic version)
FR | 069 | AFNOR NF Z 62010-1982 | France (French)
FR-0 | 025 | AFNOR NF Z 62010-1973 | France (obsolete since April 1985)
GB | 004 | BS 4730 | United Kingdom (English)
GR | 088 | HOS ELOT | Greece (obsolete)
HU | 086 | MSZ 7795/3 | Hungary (Hungarian)
IE | 207 | NSAI 433:1996 | Ireland (Irish)
INV | 170 | ISO 646:1983 | Invariant subset
IRV | 002 | ISO 646:1983 | International Reference Variant
JA | 014 | JIS C 6220-1969 | Japan (Romaji)
JA-O | 092 | JIS C 6229-1984 | Japan (OCR-B)
KR | - | KS C 5636-1989 | South Korea
MT | - | ? | Malta (Maltese, English)
NO | 060 | NS 4551 version 1 | Norway
NO-2 | 061 | NS 4551 version 2 | Norway (obsolete since June 1987)
SE | 010 | SEN 85 02 00 Annex B | Sweden (basic Swedish)
SE-C | 011 | SEN 85 02 00 Annex C | Sweden (extended Swedish for names)
T.61 | 102 | ITU/CCITT T.61 Recommendation | International (Teletex)
TW | - | CNS 5205-1996 | Republic of China (Taiwan)
US | 006 | ANSI X3.4-1968 | United States (ASCII)
YU | 141 | JUS I.B1.002 (YUSCII) | former Yugoslavia (Croatian, Slovene, Serbian, Bosnian)
Other proprietary standards approved later for international use by some standard committees:
Code | ISO-IR | Approved by | Origin | Used in
ES | 085 | ECMA | IBM | Spain (Basque, Castilian, Catalan, Galician)
esp | 017 | ECMA | Olivetti | Spanish (international)
DK-SE | 009-1 | SIS | NATS, main set | Sweden and Denmark (journalistic texts)
FI-SE | 008-1 | SIS | NATS, main set | Sweden and Finland (journalistic texts)
ita | 015 | ECMA | Olivetti | Italian
PT | 084 | ECMA | IBM | Portugal (Portuguese, Spanish)
por | 016 | ECMA | Olivetti | Portuguese (international)
The specifics of the changes for some of these variants are given in this table:
Codes (binary, dec, hex), followed by the characters for each ISO 646 compatible charset:
binary dec hex INV T.61 US JA JA-O KR CN TW IRV GB DK NO NO-2 FI,SE SE-C DE HU FR FR-0 CA-1 CA-2 IE IS ita por PT esp ES CU MT YU
010 0010 34 22 " " " " " " " " " " " " " " " " " " " " " " " " " " " " " " "
010 0011 35 23 # # # # # # # # £ # # § # # # # £ £ # # £ # £ # £ # # # # #
010 0100 36 24 ¤ $ $ $ $ ¥ $ $ $ $ $ $ ¤ ¤ $ ¤ $ $ $ $ $ $ $ $ $ $ $ ¤ $ $
010 0111 39 27 ' ' ' ' ' ' ' ' ’ ’ ’ ’ ’ ’ ’ ’ ’ ’ ’ ’ ’ ’ ' ’ ’ ’ ’ ’ ’ ’ ’
010 1100 44 2C , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
010 1101 45 2D - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
010 1111 47 2F / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / /
100 0000 64 40 @ @ @ @ @ @ @ @ @ @ @ @ @ É § Á à à à à Ó Ð § § ´ § · @ @ Ž
101 1011 91 5B [ [ [ [ [ [ [ [ [ Æ Æ Æ Ä Ä Ä É ° ° â â É Þ ° Ã Ã ¡ ¡ ¡ ġ Š
101 1100 92 5C \ ¥ ¥ ₩ \ \ \ \ Ø Ø Ø Ö Ö Ö Ö ç ç ç ç Í \ ç Ç Ç Ñ Ñ Ñ ż Đ
101 1101 93 5D ] ] ] ] ] ] ] ] ] Å Å Å Å Å Ü Ü § § ê ê Ú Æ é Õ Õ ¿ Ç ] ħ Ć
101 1110 94 5E ^ ^ ^ ^ ^ ^ ˆ ˆ ˆ ˆ ˆ ˆ Ü ˆ ˆ ^ ˆ î É Á Ö ˆ ˆ ˆ ˆ ¿ ¿ ˆ Č
101 1111 95 5F _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
110 0000 96 60 ` ` ` ` ` ` ` ` ` ` ` é ` á µ µ ô ô ó ð ù ` ` ` ` ` ċ ž
111 1011 123 7B { { { { { { { { æ æ æ ä ä ä é é é é é é þ à ã ã ° ´ ´ Ġ š
111 1100 124 7C | | | | | | | | | ø ø ø ö ö ö ö ù ù ù ù í | ò ç ç ñ ñ ñ Ż đ
111 1101 125 7D } } } } } } } } å å å å å ü ü è è è è ú æ è õ õ ç ç [ Ħ ć
111 1110 126 7E ~ ‾ ‾ ‾ ‾ ˜ ˜ ˜ ¯ | ˜ ü ß ˝ ¨ ¨ û û á ö ì ° ˜ ˜ ¨ ¨ Ċ č
In the table above, the cells with non-white background emphasize the differences from the US variant used in the Basic Latin subset of ISO/IEC 10646 and Unicode.
The characters displayed in cells with red background could be used as combining characters, when preceded or followed with a backspace C0 control. This encoding method may be considered deprecated.
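The backspace method can be illustrated by converting such overstrike sequences into Unicode combining sequences. This is only a sketch under stated assumptions: which characters served as diacritics varied by variant, and the COMBINING mapping below (quotation mark as diaeresis, apostrophe as acute, comma as cedilla, and so on) is an illustrative choice, not a list taken from the standard. The function name compose_overstrikes is likewise invented for this example.

```python
import unicodedata

BS = "\b"  # the C0 backspace control, code 8

# Assumed mapping from spacing characters to Unicode combining marks.
COMBINING = {
    '"': "\u0308",  # diaeresis
    "'": "\u0301",  # acute accent
    ",": "\u0327",  # cedilla
    "^": "\u0302",  # circumflex
    "`": "\u0300",  # grave accent
    "~": "\u0303",  # tilde
}


def compose_overstrikes(text):
    """Replace 'letter BS mark' or 'mark BS letter' with a composed letter."""
    out = []
    i = 0
    while i < len(text):
        if i + 2 < len(text) and text[i + 1] == BS:
            first, second = text[i], text[i + 2]
            if first.isalpha() and second in COMBINING:
                out.append(first + COMBINING[second])
                i += 3
                continue
            if first in COMBINING and second.isalpha():
                out.append(second + COMBINING[first])
                i += 3
                continue
        out.append(text[i])
        i += 1
    # NFC folds base letter + combining mark into one code point where possible.
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    print(compose_overstrikes('a\b"'))   # a overstruck with a diaeresis
    print(compose_overstrikes("^\be"))   # circumflex given first, then the letter
```

Both orders are accepted because the text allows the diacritic to either precede or follow the backspace.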
Later, when wider character sets gained more acceptance, ISO 8859, vendor-specific character sets and eventually Unicode became the preferred methods of coding most of these variants.
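Because a national variant differs from US-ASCII only at the national-use positions, decoding it amounts to a character-for-character translation. The following is a minimal sketch rather than a complete codec: it covers only the DE (DIN 66003) and JA (JIS C 6220) variants from the list above, and the names VARIANTS and decode_iso646 are hypothetical.

```python
# Substitutions relative to US-ASCII for two variants. Positions not
# listed here are assumed to carry the same character as in US-ASCII.
VARIANTS = {
    "DE": {"@": "§", "[": "Ä", "\\": "Ö", "]": "Ü",
           "{": "ä", "|": "ö", "}": "ü", "~": "ß"},
    "JA": {"\\": "¥", "~": "‾"},
}


def decode_iso646(data, variant):
    """Decode 7-bit `data` (bytes) as the given ISO 646 national variant."""
    table = str.maketrans(VARIANTS[variant])
    # Decoding as ASCII first rejects any byte with the high bit set.
    return data.decode("ascii").translate(table)


if __name__ == "__main__":
    print(decode_iso646(b"Gr}~e", "DE"))     # the same bytes read "Gr}~e" as US-ASCII
    print(decode_iso646(b"C:\\DOS", "JA"))   # code 0x5C appears as a yen sign
```

The second call shows the effect behind the Windows rendering legacy: the byte that US-ASCII reads as a backslash is a yen sign in the Japanese variant.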
Variants of ASCII that are not ISO 646
There are also some 7-bit character sets that are not officially part of the ISO 646 standard. Examples include:
- 7-bit Greek, ELOT 927. The Greek alphabet is mapped to positions 0x61–0x71 and 0x73–0x79, on top of the Latin lowercase letters.
- 7-bit Cyrillic, KOI-7 or Short KOI. The Cyrillic characters are mapped to positions 0x60–0x7E, on top of the Latin lowercase letters. Superseded by the KOI-8 variants.
- 7-bit Hebrew, SI 960. The Hebrew alphabet is mapped to positions 0x60–0x7A, on top of the lowercase Latin letters (and grave accent for aleph). 7-bit Hebrew was always stored in visual order. This mapping with the high bit set, i.e. with the Hebrew letters in 0xE0–0xFA, is ISO 8859-8.
- 7-bit Arabic, ASMO 449. The Arabic alphabet is mapped to positions 0x41–0x5A and 0x60–0x6A, on top of both uppercase and lowercase Latin letters. This mapping with the high bit set is ISO 8859-6.
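The relationship between the 7-bit Hebrew mapping and ISO 8859-8 described above can be sketched in a few lines: setting the high bit on each code in 0x60–0x7A yields the corresponding ISO 8859-8 code, which Python's built-in iso8859_8 codec can then decode. The function name hebrew7_to_unicode is made up for this example, and the sketch does not reorder the text, so visually ordered input stays in visual order.

```python
def hebrew7_to_unicode(data):
    """Decode 7-bit Hebrew bytes by shifting the letter range to ISO 8859-8."""
    shifted = bytearray()
    for b in data:
        if 0x60 <= b <= 0x7A:
            shifted.append(b | 0x80)   # 0x60-0x7A becomes 0xE0-0xFA
        else:
            shifted.append(b)          # digits, punctuation, uppercase Latin
    return shifted.decode("iso8859_8")


if __name__ == "__main__":
    first, last = hebrew7_to_unicode(b"\x60\x7a")
    print(hex(ord(first)), hex(ord(last)))   # code points of the first and last letter
```

The same high-bit trick would apply to ASMO 449 and ISO 8859-6, with the ranges given in the list above.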
See also
- Basic Latin (disambiguation)
- ASCII
- Extended ASCII
- EBCDIC
- ISO/IEC 2022
- ISO 8859
- ISO 10646
- ISO/IEC 6937 (ANSI)
- C0 Controls and Basic Latin, in Unicode
- C Trigraph
External links
- Zeichensatz nach ISO 646 (ASCII) (in German)
- History at GNU Aspell website
- Character Tables by Koichi Yasuoka (see Domestic ISO646 Character Tables and Quasi-ISO646 Character Tables)
- Turkish Text Deasciifier, a tool (based on statistical pentagram analysis of the Turkish language) that converts ASCII-only Turkish text back to proper Turkish by restoring the diacritics that are missing from the US-ASCII set and must be inferred from context.
Wikimedia Foundation. 2010.