Z-variant
In Unicode, two glyphs are said to be Z-variants (often spelled zVariants) if they share the same etymology but have slightly different appearances and different Unicode codepoints. For example, the Unicode characters U+8AAA 說 and U+8AAC 説 are Z-variants. The notion of Z-variance applies only to the “CJKV languages” (Chinese, Japanese, Korean and Vietnamese) and is a subtopic of Han unification.
Differences on the Z-axis
The Unicode philosophy of codepoint allocation for CJK languages is organized along three “axes.” The X-axis represents differences in semantics; for example, the Latin capital A (U+0041 A) and the Greek capital alpha (U+0391 Α) are represented by two distinct codepoints in Unicode, and might be termed “X-variants” (though this term is not common). The Y-axis represents significant differences in appearance though not in semantics; for example, the traditional Chinese character "māo" “cat” (U+8C93 貓) and the simplified Chinese character (U+732B 猫) are Y-variants. [http://www.unicode.org/glossary/]
The Z-axis represents minor typographical differences. For example, the Chinese characters (U+838A 莊) and (U+8358 荘) are Z-variants, as are (U+8AAA 說) and (U+8AAC 説). The glossary at "Unicode.org" [http://www.unicode.org/glossary/] defines “Z-variant” as “Two CJK unified ideographs with identical semantics and unifiable shapes,” where “unifiable” is taken in the sense of Han unification.
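As a small illustration of the three axes, the character pairs quoted above can be listed with their codepoints. This is only a sketch: the table `PAIRS` and the helpers `codepoint` and `describe` are hypothetical names introduced here, and the data is limited to the examples given in the text.

```python
# Character pairs quoted in the text, grouped by the axis on which they differ.
# Escapes are used so the codepoints are explicit and font-independent.
PAIRS = {
    "X": [("\u0041", "\u0391")],   # Latin capital A vs. Greek capital alpha
    "Y": [("\u8c93", "\u732b")],   # traditional vs. simplified "cat"
    "Z": [("\u838a", "\u8358"),    # the two Z-variant pairs from the text
          ("\u8aaa", "\u8aac")],
}


def codepoint(ch: str) -> str:
    """Format a single character as a U+XXXX codepoint label."""
    return f"U+{ord(ch):04X}"


def describe(axis: str) -> list[tuple[str, str]]:
    """Return the codepoint labels for every pair recorded on the given axis."""
    return [(codepoint(a), codepoint(b)) for a, b in PAIRS[axis]]


if __name__ == "__main__":
    for axis in PAIRS:
        for left, right in describe(axis):
            print(f"{axis}-axis: {left} vs. {right}")
```

However similar the two members of a pair look on screen, each one is a separate codepoint, so a plain string comparison treats them as different characters.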
Because Z-variants are by definition unifiable, they would not exist had Han unification been perfectly successful. They exist in Unicode because it was deemed useful to be able to “round-trip” documents between Unicode and other CJK encodings such as Big5 and CCCII. For example, the character 莊 has CCCII encoding 21552D, while its Z-variant 荘 has CCCII encoding 2D552D. Therefore, these two variants were given distinct Unicode codepoints, so that converting a CCCII document to Unicode and back would be a lossless operation.
Confusion
There is some confusion over the exact definition of “Z-variant.” For example, in an Internet draft (of RFC 3743) dated April 2002, [http://www.watersprings.org/pub/id/draft-jseng-idn-admin-01.txt] one finds "bù" “no” (U+4E0D 不) and (U+F967 不) described as “font variants,” the term “Z-variant” being apparently reserved for interlanguage pairs such as the Chinese "tù" “rabbit” (U+5154 兔) and the Japanese "to" “rabbit” (U+514E 兎). However, the Unicode Consortium's Unihan database [http://www.unicode.org/charts/unihan.html] treats both pairs as Z-variants.
See also
* Backward compatibility
Wikimedia Foundation. 2010.