Zero-width non-joiner

The zero-width non-joiner (ZWNJ) is a non-printing character used in the computerization of writing systems that make use of ligatures. When placed between two characters that would otherwise be connected into a ligature, a ZWNJ causes them to be printed in their final and initial forms, respectively. This is also an effect of a space character, but a ZWNJ is used when it is desirable to keep the words closer together.

The ZWNJ is encoded in Unicode as U+200C ZERO WIDTH NON-JOINER (HTML: &#8204; &zwnj;).
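As a minimal sketch of the encoding described above, the following Python snippet inspects the character with the standard library and decodes the two HTML forms. The helper name `describe` is hypothetical and chosen only for illustration.

```python
import html
import unicodedata

# U+200C, the zero-width non-joiner, written as an escape so it stays visible in source.
ZWNJ = "\u200c"


def describe(ch: str) -> dict:
    """Return basic Unicode facts about a single character (illustrative helper)."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # "Cf" is the format-character category
        "utf8": ch.encode("utf-8").hex(),
    }


info = describe(ZWNJ)

# Both the named and the numeric HTML references decode to the same character.
from_named = html.unescape("&zwnj;")
from_numeric = html.unescape("&#8204;")

print(info)
print(from_named == ZWNJ, from_numeric == ZWNJ)
```

Writing the character as the escape `"\u200c"` rather than pasting it literally keeps it from being lost or overlooked when the source file is edited.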

Use of the ZWNJ for correct typography

In certain languages, the ZWNJ is required for correct typography.

Language | Correct (with ZWNJ) | Incorrect (without ZWNJ) | Meaning
Arabic | أي‌بي‌إم | أيبيإم | 'IBM'
Persian | می‌خواهم | میخواهم | 'I want to'
Old Hebrew | עֲו‌ֹנֹת | עֲוֹנֹת | 'transgressions'
German | Auf‌lage | Auflage | 'edition'

In the Biblical Hebrew example, placing the holam dot to the left of the letter waw (ו) is correct for the intended meaning, 'transgressions'. Placing the dot directly over the ו makes the word easy to confuse visually with עוֹנוֹת, 'seasons'. (In Modern Hebrew there is no reason to use the holam after the waw and the ligature is needed, so the ZWNJ is rarely used in Modern Hebrew typesetting.)

In the German example the prefix Auf- is separated from the rest of the word in order to break the ligature fl into the separate characters f and l.
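The German case can be sketched in Python as follows. This is an illustration only: the function names `join_without_ligature` and `strip_zwnj` are hypothetical, and whether the fl ligature is actually suppressed depends on the font and rendering engine, which the code cannot check.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER


def join_without_ligature(prefix: str, stem: str) -> str:
    """Concatenate two morphemes with a ZWNJ at the boundary (illustrative helper)."""
    return prefix + ZWNJ + stem


def strip_zwnj(text: str) -> str:
    """Remove every ZWNJ, e.g. before comparing or searching text."""
    return text.replace(ZWNJ, "")


word = join_without_ligature("Auf", "lage")

print(len(word))                      # one more code point than "Auflage"
print(strip_zwnj(word) == "Auflage")  # the visible letters are unchanged
print(word.split() == [word])         # unlike a space, the ZWNJ does not split the word
```

Because the ZWNJ is an ordinary code point in the string, a plain comparison against "Auflage" fails until the character is stripped, which is worth keeping in mind for search and sorting.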

Use of the ZWNJ for displaying alternate forms

In the Devanagari script, insertion of a ZWNJ after a consonant with a halant or before a dependent vowel will prevent the characters from being joined properly. For example, the characters क् and ष typically combine to form क्ष, but when a ZWNJ is inserted between them, क्‌ष (code: &#2325;&#2381;&zwnj;&#2359;) is seen instead.
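The two Devanagari sequences can be built explicitly from code points, as in this minimal Python sketch. The constant names and the helper `code_points` are hypothetical; the halant is the character Unicode names DEVANAGARI SIGN VIRAMA.

```python
import unicodedata

ZWNJ = "\u200c"    # ZERO WIDTH NON-JOINER
KA = "\u0915"      # DEVANAGARI LETTER KA
HALANT = "\u094d"  # DEVANAGARI SIGN VIRAMA (the halant)
SSA = "\u0937"     # DEVANAGARI LETTER SSA


def code_points(text: str) -> list:
    """List the code points of a string in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


conjunct = KA + HALANT + SSA          # normally rendered as the joined conjunct
separated = KA + HALANT + ZWNJ + SSA  # ZWNJ asks the renderer to keep the halant form

print(code_points(conjunct))
print(code_points(separated))
print([unicodedata.name(ch) for ch in separated])
```

The strings differ only by the single U+200C code point; the visual difference is produced by the shaping engine and font, not by the code itself.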



Wikimedia Foundation. 2010.
