Windows-1252
Windows-1252 or CP-1252 is a character encoding of the Latin alphabet, used by default in the legacy components of Microsoft Windows in English and some other Western languages. It is one version within the group of Windows code pages. In LaTeX packages, it is referred to as ansinew.
Details
The encoding is a superset of ISO 8859-1, but differs from the IANA's ISO-8859-1 by using displayable characters rather than control characters in the 0x80 to 0x9F range. It is known to Windows by the code page number 1252, and by the IANA-approved name "windows-1252". This code page also contains all the printable characters that are in ISO 8859-15 (though some are mapped to different code points).
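The relationship described above can be checked with a short Python sketch. It assumes Python's bundled codecs named "latin-1", "cp1252" and "iso8859_15" as stand-ins for ISO-8859-1, Windows-1252 and ISO 8859-15; the variable names are illustrative only.

```python
# Sketch: compare how the "latin-1" and "cp1252" codecs decode the same bytes.
# Smart quotes around "Hi", followed by the euro sign at 0x80.
sample = bytes([0x93, 0x48, 0x69, 0x94, 0x80])

as_latin1 = sample.decode("latin-1")  # 0x80-0x9F become C1 control characters
as_cp1252 = sample.decode("cp1252")   # 0x80-0x9F become printable characters

print([f"U+{ord(c):04X}" for c in as_latin1])
print([f"U+{ord(c):04X}" for c in as_cp1252])

# Outside the 0x80-0x9F range the two codecs agree byte for byte.
shared = bytes(range(0x00, 0x80)) + bytes(range(0xA0, 0x100))
assert shared.decode("latin-1") == shared.decode("cp1252")

# Every printable ISO 8859-15 character can be encoded in Windows-1252,
# although some of them land on different byte values.
latin9_upper = bytes(range(0xA0, 0x100)).decode("iso8859_15")
relocated = {
    c: (c.encode("iso8859_15")[0], c.encode("cp1252")[0])
    for c in latin9_upper
    if c.encode("iso8859_15") != c.encode("cp1252")
}
print({c: (hex(a), hex(b)) for c, (a, b) in relocated.items()})
```

The `relocated` dictionary collects the characters that both encodings contain but place at different code points, such as the euro sign.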
Windows-1252 text is very commonly mislabeled with the charset label ISO-8859-1. A common result is that the quotes and apostrophes produced by "smart quotes" in Microsoft software are replaced with question marks or boxes on non-Windows operating systems, making text difficult to read. To accommodate such mislabeling, most modern web browsers and e-mail clients treat the MIME charset ISO-8859-1 as Windows-1252. This is now standard behavior in the draft HTML 5 specification, which requires that documents advertised as ISO-8859-1 actually be parsed with the Windows-1252 encoding.[1]
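A minimal sketch of this accommodation in Python follows. The function name `decode_labeled` and the label set are hypothetical and cover only a few spellings; a real client would follow the full label list of the specification it implements.

```python
# Hypothetical label set: a few spellings that this sketch treats as Windows-1252.
TREAT_AS_CP1252 = {"iso-8859-1", "iso8859-1", "latin1", "windows-1252", "cp1252"}


def decode_labeled(data: bytes, label: str) -> str:
    """Decode data, parsing ISO-8859-1-labeled input as Windows-1252."""
    name = label.strip().lower()
    codec = "cp1252" if name in TREAT_AS_CP1252 else name
    # errors="replace" substitutes U+FFFD for bytes the codec cannot decode.
    return data.decode(codec, errors="replace")


# Smart quotes survive even though the label claims ISO-8859-1.
print(decode_labeled(b"\x93quoted\x94", "ISO-8859-1"))
```

Labels outside the set are passed to Python's codec lookup unchanged, so an unknown label raises `LookupError` rather than being guessed at.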
Historically, the term "ANSI code page" (ACP) has been used in Windows to refer to various code pages considered native. The intention was that most of these would be ANSI standards such as ISO-8859-1. Although Windows-1252 was the first and by far the most popular code page to be given this name in Microsoft Windows parlance, it has never been an ANSI standard. Microsoft now states that "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."[2]
Codepage layout
The following table shows Windows-1252. Each character is shown with its Unicode equivalent and its decimal code.
Microsoft also provides "best fit" Unicode mappings of Windows-1252,[3] which additionally cover the five unmapped C1 code points as well as code points that map to Windows-1252 in a lossy fashion.
Windows-1252 (CP1252)

| | −0 | −1 | −2 | −3 | −4 | −5 | −6 | −7 | −8 | −9 | −A | −B | −C | −D | −E | −F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0− | NUL 0000 0 | SOH 0001 1 | STX 0002 2 | ETX 0003 3 | EOT 0004 4 | ENQ 0005 5 | ACK 0006 6 | BEL 0007 7 | BS 0008 8 | HT 0009 9 | LF 000A 10 | VT 000B 11 | FF 000C 12 | CR 000D 13 | SO 000E 14 | SI 000F 15 |
| 1− | DLE 0010 16 | DC1 0011 17 | DC2 0012 18 | DC3 0013 19 | DC4 0014 20 | NAK 0015 21 | SYN 0016 22 | ETB 0017 23 | CAN 0018 24 | EM 0019 25 | SUB 001A 26 | ESC 001B 27 | FS 001C 28 | GS 001D 29 | RS 001E 30 | US 001F 31 |
| 2− | SP 0020 32 | ! 0021 33 | " 0022 34 | # 0023 35 | $ 0024 36 | % 0025 37 | & 0026 38 | ' 0027 39 | ( 0028 40 | ) 0029 41 | * 002A 42 | + 002B 43 | , 002C 44 | - 002D 45 | . 002E 46 | / 002F 47 |
| 3− | 0 0030 48 | 1 0031 49 | 2 0032 50 | 3 0033 51 | 4 0034 52 | 5 0035 53 | 6 0036 54 | 7 0037 55 | 8 0038 56 | 9 0039 57 | : 003A 58 | ; 003B 59 | < 003C 60 | = 003D 61 | > 003E 62 | ? 003F 63 |
| 4− | @ 0040 64 | A 0041 65 | B 0042 66 | C 0043 67 | D 0044 68 | E 0045 69 | F 0046 70 | G 0047 71 | H 0048 72 | I 0049 73 | J 004A 74 | K 004B 75 | L 004C 76 | M 004D 77 | N 004E 78 | O 004F 79 |
| 5− | P 0050 80 | Q 0051 81 | R 0052 82 | S 0053 83 | T 0054 84 | U 0055 85 | V 0056 86 | W 0057 87 | X 0058 88 | Y 0059 89 | Z 005A 90 | [ 005B 91 | \ 005C 92 | ] 005D 93 | ^ 005E 94 | _ 005F 95 |
| 6− | ` 0060 96 | a 0061 97 | b 0062 98 | c 0063 99 | d 0064 100 | e 0065 101 | f 0066 102 | g 0067 103 | h 0068 104 | i 0069 105 | j 006A 106 | k 006B 107 | l 006C 108 | m 006D 109 | n 006E 110 | o 006F 111 |
| 7− | p 0070 112 | q 0071 113 | r 0072 114 | s 0073 115 | t 0074 116 | u 0075 117 | v 0076 118 | w 0077 119 | x 0078 120 | y 0079 121 | z 007A 122 | { 007B 123 | \| 007C 124 | } 007D 125 | ~ 007E 126 | DEL 007F 127 |
| 8− | € 20AC 128 | (unused) | ‚ 201A 130 | ƒ 0192 131 | „ 201E 132 | … 2026 133 | † 2020 134 | ‡ 2021 135 | ˆ 02C6 136 | ‰ 2030 137 | Š 0160 138 | ‹ 2039 139 | Œ 0152 140 | (unused) | Ž 017D 142 | (unused) |
| 9− | (unused) | ‘ 2018 145 | ’ 2019 146 | “ 201C 147 | ” 201D 148 | • 2022 149 | – 2013 150 | — 2014 151 | ˜ 02DC 152 | ™ 2122 153 | š 0161 154 | › 203A 155 | œ 0153 156 | (unused) | ž 017E 158 | Ÿ 0178 159 |
| A− | NBSP 00A0 160 | ¡ 00A1 161 | ¢ 00A2 162 | £ 00A3 163 | ¤ 00A4 164 | ¥ 00A5 165 | ¦ 00A6 166 | § 00A7 167 | ¨ 00A8 168 | © 00A9 169 | ª 00AA 170 | « 00AB 171 | ¬ 00AC 172 | SHY 00AD 173 | ® 00AE 174 | ¯ 00AF 175 |
| B− | ° 00B0 176 | ± 00B1 177 | ² 00B2 178 | ³ 00B3 179 | ´ 00B4 180 | µ 00B5 181 | ¶ 00B6 182 | · 00B7 183 | ¸ 00B8 184 | ¹ 00B9 185 | º 00BA 186 | » 00BB 187 | ¼ 00BC 188 | ½ 00BD 189 | ¾ 00BE 190 | ¿ 00BF 191 |
| C− | À 00C0 192 | Á 00C1 193 | Â 00C2 194 | Ã 00C3 195 | Ä 00C4 196 | Å 00C5 197 | Æ 00C6 198 | Ç 00C7 199 | È 00C8 200 | É 00C9 201 | Ê 00CA 202 | Ë 00CB 203 | Ì 00CC 204 | Í 00CD 205 | Î 00CE 206 | Ï 00CF 207 |
| D− | Ð 00D0 208 | Ñ 00D1 209 | Ò 00D2 210 | Ó 00D3 211 | Ô 00D4 212 | Õ 00D5 213 | Ö 00D6 214 | × 00D7 215 | Ø 00D8 216 | Ù 00D9 217 | Ú 00DA 218 | Û 00DB 219 | Ü 00DC 220 | Ý 00DD 221 | Þ 00DE 222 | ß 00DF 223 |
| E− | à 00E0 224 | á 00E1 225 | â 00E2 226 | ã 00E3 227 | ä 00E4 228 | å 00E5 229 | æ 00E6 230 | ç 00E7 231 | è 00E8 232 | é 00E9 233 | ê 00EA 234 | ë 00EB 235 | ì 00EC 236 | í 00ED 237 | î 00EE 238 | ï 00EF 239 |
| F− | ð 00F0 240 | ñ 00F1 241 | ò 00F2 242 | ó 00F3 243 | ô 00F4 244 | õ 00F5 245 | ö 00F6 246 | ÷ 00F7 247 | ø 00F8 248 | ù 00F9 249 | ú 00FA 250 | û 00FB 251 | ü 00FC 252 | ý 00FD 253 | þ 00FE 254 | ÿ 00FF 255 |

Legend: each cell lists the character, its Unicode code point in hexadecimal, and its decimal code. Cells marked (unused) have no assigned character. Differences from ISO-8859-1 are confined to rows 8− and 9−.
According to Microsoft's and the Unicode Consortium's websites, positions 0x81, 0x8D, 0x8F, 0x90, and 0x9D are unused. However, the Windows API call for converting from code pages to Unicode maps these to the corresponding C1 control codes. The euro sign at position 0x80 was not present in earlier versions of this code page, nor were the letters S and Z with caron (háček).
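Both treatments of the unused positions can be illustrated in Python, whose bundled "cp1252" codec rejects those bytes. The sketch below first lists the bytes the codec leaves undefined, then registers an error handler that passes them through as the code points with the same numeric value, roughly imitating the C1 mapping described above. The handler name "c1_fallback" is invented for this example and is not part of any standard library.

```python
import codecs

# Find the byte values that Python's bundled cp1252 codec leaves undefined.
unused = []
for value in range(256):
    try:
        bytes([value]).decode("cp1252")
    except UnicodeDecodeError:
        unused.append(value)

print([f"{v:02X}" for v in unused])


def c1_fallback(error):
    """Map undecodable bytes to the code points with the same numeric value."""
    if isinstance(error, UnicodeDecodeError):
        bad = error.object[error.start:error.end]
        return "".join(chr(b) for b in bad), error.end
    raise error


# "c1_fallback" is a handler name made up for this sketch.
codecs.register_error("c1_fallback", c1_fallback)

# 0x80 decodes normally to the euro sign; 0x81 falls back to U+0081.
text = b"\x80 \x81".decode("cp1252", errors="c1_fallback")
print([f"U+{ord(c):04X}" for c in text])
```

The first print should list the same five positions named above. Without the handler, decoding any of those bytes with the default strict error mode raises `UnicodeDecodeError`.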
External links
- Windows 1252 reference chart
- IANA Charset Name Registration
- Unicode mapping table for Windows 1252
- Unicode mappings of windows 1252 with "best fit"
References
Wikimedia Foundation. 2010.