Esperanto orthography
Esperanto is written in a Latin alphabet of twenty-eight letters, with upper and lower case. This is supplemented by punctuation marks and by various logograms, such as the numerals 0–9, currency signs such as $, and mathematical symbols.
Twenty-two of the letters are identical in form to letters of the English alphabet (q, w, x, and y being omitted). The remaining six have diacritic marks, ĉ, ĝ, ĥ, ĵ, ŝ, and ŭ (that is, c, g, h, j, and s circumflex, and u breve). The full alphabet is:
A B C Ĉ D E F G Ĝ H Ĥ I J Ĵ K L M N O P R S Ŝ T U Ŭ V Z
a b c ĉ d e f g ĝ h ĥ i j ĵ k l m n o p r s ŝ t u ŭ v z
The letters have approximately the sound values of the IPA, with the exception of c [t͡s] and the circumflex letters ĉ [t͡ʃ], ĝ [d͡ʒ], ĥ [x], ĵ [ʒ], ŝ [ʃ]. There is a nearly one-to-one correspondence of letter to sound; the only significant exceptions are the sequence kz, as in ekzemple, which is frequently pronounced [ɡz], and n, which is frequently pronounced [ŋ] before g and k. (See Esperanto phonology.)
In handwritten Esperanto, the diacritics pose no problem. However, since they don't appear on standard alphanumeric keyboards, various alternate methods have been devised for representing them in printed and typed text. The original method was a set of digraphs now known as the "h-system", but with the rise of computer word processing a so-called "x-system" has become equally popular. These systems are described below. However, with the advent of Unicode, the need for such work-arounds has lessened.
Unique to the Esperanto script is the spesmilo (1000 specie) sign, an Sm monogram for a now-obsolete international unit of auxiliary Esperanto currency used by a few British and Swiss banks before World War I. It has been assigned the Unicode value U+20B7, though in ordinary fonts it is often transcribed as Sm, usually italic.
Origin
The script is modeled after Western Slavic scripts such as the Czech or Sorbian alphabet. However, the use of circumflexes instead of carons for the letters ĉ, ĝ, ĥ, ĵ, ŝ avoids the appearance of any particular Latin alphabet, and the non-Slavic bases g, j of the letters ĝ and ĵ, rather than Slavic dž, ž, help preserve the printed appearance of Latinate and Germanic vocabulary such as ĝenerala "general" and ĵurnalo "journal". The letter v stands for either v or w of other languages. The letter ŭ of the diphthongs aŭ, eŭ appears to be from the Belarusian Łacinka alphabet, historically associated with the Polish-Lithuanian Commonwealth. (Today Łacinka is strikingly similar to the Esperanto alphabet, but in Zamenhof's day it was closer to Polish; the convergence came with orthographic reforms two decades after Zamenhof went public with Esperanto.)
Due to these Slavic origins, the spelling of geographic names is sometimes divergent from English. This is especially remarked upon when English has the letters x, w, qu, or gu, as in Vaŝintono "Washington", Meksiko "Mexico", or Gvatemalo "Guatemala". However, such spellings are normal to several languages of Central, Northern, and Eastern Europe. Compare the Esperanto forms with Croatian Vašington, Meksiko, and Gvatemala. Likewise, cunamo, from Japanese tsunami, is similar to Czech and Latvian cunami.
Non-printed variants
Esperanto versions of international Morse code and Braille include the six diacritic letters. In Braille, the circumflex is indicated by adding a point at position 6 (lower right), and the u-breve is the mirror image of u. An Esperanto Braille magazine, Aŭroro, has been published since 1920.
There is a proposed manual alphabet as part of the Signuno project. Signuno itself, as signed Esperanto rather than a language in its own right, is a manual logographic Esperanto orthography. The Signuno alphabet deviates from international norms (that is, ASL with an Irish T) in that all letters are upright, with a straight wrist: the G is simply turned upright, while the H, P, Q are taken from Irish, the J from Russian, and the Z appears to be unique to Signuno. (It's shaped like an ASL 3, and appears to be derived from alphabetically adjacent V the way Ŭ was derived from adjacent U.) The diacritic letters Ŝ, Ĥ, Ĝ, Ŭ are derived from their base letters S, H, G, U; while Ĉ and Ĵ, like J, are Russian. Numerals 1–5 include the thumb, 6–9 do not, and 10, 100, 1000 are the Roman numerals X, C, M.
The names of the letters of the alphabet
Zamenhof simply tacked an -o onto each consonant to create the name of the letter, with the vowels representing themselves: a, bo, co, ĉo, do, e, fo, etc. The diacritics are frequently mentioned overtly. For instance, ĉ may be called ĉo ĉapela or co ĉapela, from ĉapelo (a hat), and ŭ may be called ŭo luneta or u luneta, from luno (a moon) plus the diminutive -et-. This is the only system that is widely accepted and in practical use.
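Zamenhof's naming rule is mechanical enough to sketch in a few lines of Python. The function names `letter_name` and `spell_out` are hypothetical, chosen only for this illustration; the rule itself (vowels name themselves, consonants take -o) is the one described above.

```python
# The 28 letters of the Esperanto alphabet, in lower case.
LETTERS = set("abcĉdefgĝhĥijĵklmnoprsŝtuŭvz")
VOWELS = set("aeiou")

def letter_name(letter):
    """Return Zamenhof's name for a single Esperanto letter."""
    lower = letter.lower()
    if lower not in LETTERS:
        raise ValueError(f"not an Esperanto letter: {letter!r}")
    # Vowels stand for themselves; every consonant (including ŭ) takes -o.
    return lower if lower in VOWELS else lower + "o"

def spell_out(word):
    """Return the list of letter names used to spell a word aloud."""
    return [letter_name(ch) for ch in word]

print(" ".join(spell_out("ĉapelo")))  # ĉo a po e lo o
print("".join(spell_out("ktp")))      # kotopo
```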
The letters of the basic Latin alphabet not found in the Esperanto alphabet have distinct names, much as letters of the Greek alphabet do. ⟨Q⟩, ⟨x⟩, ⟨y⟩ are kuo, ikso, ipsilono. The other is ⟨w⟩; this has been called duobla vo (double V), germana vo (German V), and vavo (using Waringhien's convention for ⟨v⟩ below).
However, while this is fine for initialisms such as ktp [kotopo] for etc., it can be problematic when spelling out names. For example, several consonantal distinctions are difficult for many nationalities, who normally rely on the fact that Esperanto seldom uses these sounds to distinguish words (that is, they do not form many minimal pairs). Thus the pairs of letter names ĵo–ĝo, ĥo–ho (or ĥo–ko), co–ĉo (or co–so, co–to), lo–ro, and ŭo–vo (or vo–bo) are problematic. In addition, over a noisy telephone connection it quickly becomes apparent that voicing distinctions can be difficult to make out: noise confounds the pairs po–bo, to–do, ĉo–ĝo, ko–go, fo–vo, so–zo, ŝo–ĵo, as well as the nasals mo–no.
There have been several proposals to resolve this problem. Gaston Waringhien proposed changing the vowel of voiced obstruents to a, so that at least voicing is not problematic. Also changed to a are h, n, r, distinguishing them from ĥ, m, l. The result is perhaps the most common alternative in use:
- a, ba, co, ĉo, da, e, fo, ga, ĝa, ha, ĥo, i, jo, ĵa, ko, lo, mo, na, o, po, ra, so, ŝo, to, u, ŭo, va, za
However, this still requires overt mention of the diacritics, and even so does not reliably distinguish ba–va, co–so, ĉo–ŝo, or ĝa–ĵa.
The proposal closest to international norms (and thus the easiest to remember) that clarifies all the above distinctions is a modification of a proposal by Kálmán Kalocsay. As with Zamenhof, vowels stand for themselves, but it follows the international standard of placing vowel e after a consonant by default (be, ce, de, ge), but before sonorants (el, en) and voiceless fricatives (ef, es). The vowel a is used for ⟨h⟩ and the voiceless plosives ⟨p⟩, ⟨t⟩, ⟨k⟩, after the international names ha for ⟨h⟩ and ka for ⟨k⟩; the French name ĵi is used for ⟨ĵ⟩, the Greek name ĥi (chi) for ⟨ĥ⟩, and the English name ar for ⟨r⟩. The letter ⟨v⟩ has the i vowel of ĵi, distinguishing it from ⟨b⟩, but the other voiced fricative, ⟨z⟩, does not, to avoid the problem of it palatalizing and being confused with ĵi. The diphthong offglide ⟨ŭ⟩ is named eŭ, the only real possibility given Esperanto phonotactics besides aŭ, which as the word for "or" would cause confusion. The letter ⟨m⟩ is called om to distinguish it from ⟨n⟩; the vowel o alliterates well in the alphabetical sequence el, om, en, o, pa. There are other patterns to the vowels in the ABC rhyme: The lines start with a i a i and finish with a a e e. The letters with diacritics are placed at the end of the rhyme, taking the place of w, x, y in other Latin alphabets, so as not to disrupt the pattern of letters many people learned as children. All this makes the system more easily memorized than competing proposals. The modified Kalocsay abecedary is:
- a, be, ce, de, e, ef, ge, ha,
- i, je, ka, el, om, en, o, pa,
- ar, es, ta, u, vi, ĉa, ĝe,
- ĥi kaj ĵi, eŝ, eŭ kaj ze,
- plus ku', ikso, ipsilono,
- jen la abece-kolono.
(kaj means "and". The last line reads: lo! the ABC column)
Where letters are still confused, such as es vs eŝ or a vs ha, mention can be made of the diacritic (eŝ ĉapela) or of the manner of articulation of the sound (ha brueta "breathy aitch"). Quite commonly, however, people will use the "aitch as in house" strategy used in English.
Punctuation
As with most languages, punctuation is not completely standardized, but in Esperanto there is the additional complication of multiple competing national traditions.
Commas are required to introduce subordinate clauses (that is, before ke "that" or the ki- correlatives):
- Mi ne scias, kiel fari tion. (I don't know how to do that.)
The comma is also used for the decimal point, while thousands are separated by non-breaking spaces: 12 345 678,9.
Question marks (?) and exclamation marks (!) are used at the end of a clause, and may be internal to a sentence. Question words generally come at the beginning of a question, obviating the need for Spanish-style inverted question marks.
Periods may be used to indicate initialisms: k.t.p. or ktp for kaj tiel plu (et cetera), but not abbreviations that retain the grammatical suffixes. Instead, a hyphen optionally replaces the missing letters: D-ro or Dro for Doktoro (Dr). With ordinal numerals, the adjectival a and accusative n may be superscripted: 13a or 13ª (13ᵗʰ). The abbreviation k is used without a period for kaj (and); the ampersand (&) is not found. Roman numerals are also avoided.
The hyphen is also occasionally used to clarify compounds, and to join grammatical suffixes to proper names that haven't been Esperantized or don't have a nominal -o suffix, such as the accusative on Kalocsay-n or Kálmán-on. The proximate particle ĉi used with correlatives, such as ĉi tiu 'this one' and ĉi tie 'here', may be poetically used with nouns and verbs as well (ĉi jaro 'this year', esti ĉi 'to be here'), but if these phrases are then changed to adjectives or adverbs, a hyphen is used: ĉi-jare 'this year', ĉi-landa birdo 'a bird of this land'.[1]
Quotation marks show the greatest variety of any punctuation. The use of Esperanto quotation marks was never stated in Zamenhof's work; it was assumed that a printer would use whatever he had available (usually the national standard of the printer's country). — Dashes, « guillemets » (often »reversed«), “double apostrophes” (also often „reversed“), and more are all found. Since the age of word processing, however, the standard English quotation marks have become most widespread. Quotations may be introduced with either a comma or colon.
Capitalization
Capitalization is used for the first word of a sentence and for proper names when used as nouns. Names of months, days of the week, ethnicities, languages, and the adjectival forms of proper names are not typically capitalized (anglo "an Englishman", angla "English", usona "US American"), though national norms may override such generalizations. Titles are more variable: both the Romance style of capitalizing only the first word of the title and the English style of capitalizing all lexical words are found.
All capitals or small capitals are used for acronyms and initialisms of proper names, like TEJO, but not common expressions like ktp (etc.). Small capitals are also a common convention for family names, to avoid the confusion of varying national naming conventions: KALOCSAY Kálmán, Leslie CHEUNG Kwok Wing.
Camel case, with or without a hyphen, may occur when a prefix is added to a proper noun: la geZamenhofoj (the Zamenhofs), pra-Esperanto (Proto-Esperanto). It is also used for Russian-style syllabic acronyms, such as the name ReVo for Reta Vortaro ("Internet Dictionary"), which is homonymous with revo (dream). Occasionally mixed capitalization will be used for orthographic puns, such as espERAnto, which stands for the esperanta radikala asocio (Radical Esperanto Association).
Zamenhof contrasted informal ci with formal, and capitalized, Vi as the second-person singular pronouns. However, lower-case vi is now used as the second-person pronoun regardless of number.
ASCII transliteration systems
There are two alternative orthographies in common use, which replace the circumflex letters with either h digraphs or x digraphs. There are also work-arounds such as approximating the circumflexes with carets.
H-system
The original method of working around the diacritics was developed by the creator of Esperanto himself, L. L. Zamenhof. He recommended using u in place of ŭ, and using digraphs with h for the circumflex letters. For example, ŝ is represented by sh, as in shi for ŝi (she), and shanco for ŝanco (chance).
Unfortunately this method suffers from several problems:
- h is already a consonant in the language, so digraphs occasionally make words ambiguous;
- when ŭ is changed to u, not only is there the occasional ambiguity, but a naive reading may place the stress on the wrong syllable;
- simplistic ASCII-based rules for sorting words fail badly for sorting h-digraphs, because lexicographically words in ĉ should follow all words in c and precede words in d. The word ĉu should be placed after ci, but sorted in the h-system, chu would appear before ci.
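The ambiguity and sorting problems listed above can be reproduced with a short Python sketch. All function names here are hypothetical, and the compound flughaveno (flug-haveno, "airport") is an example chosen for this illustration rather than one taken from the text; the digraph table follows the h-system as described (ch, gh, hh, jh, sh, and plain u for ŭ).

```python
ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANK = {letter: position for position, letter in enumerate(ALPHABET)}

H_DIGRAPHS = {"ĉ": "ch", "ĝ": "gh", "ĥ": "hh", "ĵ": "jh", "ŝ": "sh", "ŭ": "u"}

def to_h_system(text):
    """Replace each diacritic letter with its h-system spelling."""
    pieces = []
    for ch in text:
        digraph = H_DIGRAPHS.get(ch.lower())
        if digraph is None:
            pieces.append(ch)
        else:
            pieces.append(digraph.capitalize() if ch.isupper() else digraph)
    return "".join(pieces)

def naive_from_h_system(text):
    """Blindly turn every lower-case h digraph back into a circumflex letter.

    Illustration only: it cannot tell a digraph from a real consonant + h.
    """
    for letter, digraph in H_DIGRAPHS.items():
        if letter != "ŭ":  # a plain u cannot be told apart from a former ŭ
            text = text.replace(digraph, letter)
    return text

def esperanto_key(word):
    """Sort key in Esperanto alphabetical order; other characters sort last."""
    return [RANK.get(ch, len(ALPHABET)) for ch in word.lower()]

words = ["dek", "ĉu", "ci", "cent"]
print(sorted(words, key=esperanto_key))       # ['cent', 'ci', 'ĉu', 'dek']
print(sorted(to_h_system(w) for w in words))  # ['cent', 'chu', 'ci', 'dek']
print(naive_from_h_system("flughaveno"))      # fluĝaveno
```

The first line of output is the dictionary order; the second shows chu jumping ahead of ci under plain ASCII sorting; the third shows a genuine g + h sequence being misread as ĝ.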
X-system
A more recent system for typing in Esperanto is the so-called "x-system", which uses x instead of h for the digraphs, including ux for ŭ. For example, ŝ is represented by sx, as in sxi for ŝi and sxanco for ŝanco.
X-digraphs solve those problems of the h-system:
- x is not a letter in the Esperanto alphabet, so its use introduces no ambiguity;
- words starting with cx are now nearly always correctly sorted after words starting with c. The sorting only fails in the special case of a z in compound or unassimilated words; for example, the compound word reuzi (to reuse) would be sorted after reŭmatismo (rheumatism). Such cases are rare.
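Because x never occurs in native Esperanto words, conversion between the x-system and the accented letters can be done with a simple table lookup. The following is a minimal Python sketch under that assumption; `from_x_system` and `to_x_system` are hypothetical names, not part of any existing tool.

```python
import re

X_TO_UNICODE = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}
UNICODE_TO_X = {letter: base + "x" for base, letter in X_TO_UNICODE.items()}

def from_x_system(text):
    """Convert x digraphs (cx, gx, hx, jx, sx, ux) to accented letters."""
    def convert(match):
        base = match.group(1)
        letter = X_TO_UNICODE[base.lower()]
        # Preserve the case of the base letter: Sx -> Ŝ, sx -> ŝ.
        return letter.upper() if base.isupper() else letter
    return re.sub(r"([cghjsu])x", convert, text, flags=re.IGNORECASE)

def to_x_system(text):
    """Convert accented letters to x digraphs."""
    pieces = []
    for ch in text:
        digraph = UNICODE_TO_X.get(ch.lower())
        if digraph is None:
            pieces.append(ch)
        else:
            pieces.append(digraph.capitalize() if ch.isupper() else digraph)
    return "".join(pieces)

print(from_x_system("sxanco"))                    # ŝanco
print(from_x_system("ehxosxangxoj cxiujxauxde"))  # eĥoŝanĝoj ĉiuĵaŭde
print(to_x_system("Ĉu ŝi?"))                      # Cxu sxi?

# The rare sorting failure: x sorts before z, so the x-system spelling of
# reŭmatismo lands before reuzi, although u should precede ŭ.
print(sorted(["reuzi", to_x_system("reŭmatismo")]))  # ['reuxmatismo', 'reuzi']
```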
The x-system has become as popular as the h-system, but many people dislike it for its perceived "odd" appearance – in the words of one Esperantist, it "aspektas klingone" (looks like Klingon).[2] Proponents argue that it would look "odd" only if one is expecting the appearance to resemble that of other European languages.
A practical problem of digraph substitution that the x-system does not completely resolve is in the complication of bilingual texts. Ux for ŭ is especially problematic when used alongside French text, because many French words end in aux or eux. Aux, for example, is a word in both languages (aŭ in Esperanto). Any automatic conversion of the text will alter the French words as well as the Esperanto. A few English words like "auxiliary" and "Euxine" can also suffer from such search-and-replace routines. A few people have proposed using "vx" instead of "ux" for ŭ to resolve this problem, but this variant of the system is rarely used. Some systems use xx to escape the ux to ŭ conversion, e.g. "auxx" would produce "aux".
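The bilingual-text problem and the xx escape can be illustrated with a small Python sketch. It handles lower-case text only, and it assumes the escape applies to every x digraph rather than to ux alone, which the text does not specify; both function names are hypothetical.

```python
import re

X_TO_UNICODE = {"c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ"}

def naive_convert(text):
    """Convert every x digraph, with no escape mechanism."""
    return re.sub(r"([cghjsu])x", lambda m: X_TO_UNICODE[m.group(1)], text)

def convert_with_escape(text):
    """Convert x digraphs, treating a doubled x as an escaped literal x."""
    def convert(match):
        base, escape = match.group(1), match.group(2)
        if escape:
            # 'uxx' stands for a literal 'ux': drop one x, convert nothing.
            return base + "x"
        return X_TO_UNICODE[base]
    return re.sub(r"([cghjsu])x(x?)", convert, text)

# French and English words are damaged by blind search-and-replace.
print(naive_convert("aux chevaux"))  # aŭ chevaŭ
print(naive_convert("auxiliary"))    # aŭiliary

# With the escape, the writer can protect such words by doubling the x.
print(convert_with_escape("auxx"))          # aux
print(convert_with_escape("chevauxx aux"))  # chevaux aŭ
```

The escape keeps the conversion reversible, but it shifts the burden to the writer, who must remember to double the x in every foreign word.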
Graphic work-arounds
There are several ad hoc work-arounds used in email or on the internet, where the proper letters are often not supported, as seen also in non-ASCII orthographies such as German. These "slipped-hat" conventions make use of the caret (^) or greater than sign (>) to represent the circumflex. For example, ŝanco may be written ^sanco, s^anco, or s>anco.[2] This has the twin advantages of being unambiguous while being iconic with the official orthography. However, they have generally fallen out of favor as Esperantists have gotten used to typing in the h or x systems with software that automatically converts these to standard orthography. Before the internet age, Stefano la Colla had proposed shifting the caret onto the following vowel, since French circumflex vowels are supported in printing houses. That is, one would write ehôsângôj cîujâude for eĥoŝanĝoj ĉiuĵaŭde.[3] However, this proposal has never been adopted.
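Software that converts the caret conventions to standard orthography can lean on Unicode itself: a base letter followed by the combining circumflex (U+0302) composes to the precomposed Esperanto letter under NFC normalization. The sketch below is one possible approach in Python, not a description of any particular tool; `from_slipped_hat` is a hypothetical name, and the function would also rewrite a stray ^ or > that happens to touch one of the five base letters.

```python
import re
import unicodedata

COMBINING_CIRCUMFLEX = "\u0302"
BASES = "cghjsCGHJS"

def from_slipped_hat(text):
    """Convert s^, s> and ^s style spellings to circumflex letters."""
    add_hat = lambda m: m.group(1) + COMBINING_CIRCUMFLEX
    # Postfix forms first: the mark follows the letter (s^anco, s>anco).
    text = re.sub(rf"([{BASES}])[\^>]", add_hat, text)
    # Then the prefix form: the caret precedes the letter (^sanco).
    text = re.sub(rf"\^([{BASES}])", add_hat, text)
    # NFC merges letter + combining circumflex into one precomposed character.
    return unicodedata.normalize("NFC", text)

for spelling in ("^sanco", "s^anco", "s>anco"):
    print(from_slipped_hat(spelling))  # ŝanco each time
```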
Many new Esperantists from America and Western Europe, who are not used to diacritics from their own languages, perceive the Esperanto diacritics to be a problem, and often propose reforms to the orthography, sometimes with substantial modifications, in order to "fix" it. Such proposals are ignored by the community, both because they generally come from people who do not know the language well, and because reform projects tend to snowball, a fate that has destroyed several constructed languages.
The transliteration of Esperanto into ASCII is a topic known to cause flame wars with little constructive discussion. The reduction of such behavior is sometimes indicated as one of the main reasons to go to the extra effort of using the proper diacritics. With the advent of Unicode, transliteration systems are no longer necessary on web pages. Nonetheless, the h- and x-systems remain common on Usenet and in e-mail, where encoding support is rare and the limited availability of keyboard configurations often makes it cumbersome to type the diacritics.
Unicode
The entire Esperanto alphabet is part of the Latin-3 and Unicode character sets, and is included in WGL4. The code points and HTML entities for the special Esperanto characters in Unicode are:
Character  Description   Code point  HTML
Ĉ          C-circumflex  U+0108      &#264;
ĉ          c-circumflex  U+0109      &#265;
Ĝ          G-circumflex  U+011C      &#284;
ĝ          g-circumflex  U+011D      &#285;
Ĥ          H-circumflex  U+0124      &#292;
ĥ          h-circumflex  U+0125      &#293;
Ĵ          J-circumflex  U+0134      &#308;
ĵ          j-circumflex  U+0135      &#309;
Ŝ          S-circumflex  U+015C      &#348;
ŝ          s-circumflex  U+015D      &#349;
Ŭ          U-breve       U+016C      &#364;
ŭ          u-breve       U+016D      &#365;
Practical Unicode for Esperanto
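The code points and numeric HTML entities for the twelve special characters can be derived programmatically rather than memorized. A minimal Python sketch using only the standard library (`describe` is a hypothetical helper name):

```python
import unicodedata

ESPERANTO_SPECIALS = "ĈĉĜĝĤĥĴĵŜŝŬŭ"

def describe(ch):
    """Return (code point, numeric HTML entity, Unicode name) for one character."""
    return f"U+{ord(ch):04X}", f"&#{ord(ch)};", unicodedata.name(ch)

for ch in ESPERANTO_SPECIALS:
    code_point, entity, name = describe(ch)
    print(ch, code_point, entity, name)
```

The numeric entity is simply the code point written in decimal, so U+0108 becomes `&#264;`.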
Microsoft Windows
Adjusting a keyboard to type Unicode is relatively easy. All Windows variants of the Microsoft Windows NT family, such as 2000 and XP, support Unicode; Windows 9x does not natively support it.
A simple, free utility with all the Esperanto keys already set up is Esperanto keyboard layout for Microsoft Windows (QWERTY version), which is available as a free download.
A similar tool, Ek, is also available without charge. It can be downloaded by clicking on Instalilo: ek(version#)inst.exe. Ek uses x-system keying (typing cx produces ĉ). It works with most programs, though it is not compatible with all of them.
A commercial but still cheap tool is Šibboleth, a program that can produce every Latin character. It enables composition of ĝ etc. using the ^ deadkey (like for French letters), so you do not have to learn new key positions. The ŭ is produced by the combination u followed by #.
Many popular e-mail clients support Unicode, so you can happily use the tools described above to write e-mails using the Esperanto alphabet.
If you want to use a text editor that is Esperanto-compatible, make sure it supports Unicode, as do Editplus (UTF-8) and UniRed.
Linux
In Linux systems, configuration difficulty depends on how old the system is. On old systems, it may be necessary to activate Unicode by setting the locale to a UTF-8 locale. There is a special eo_XX.UTF-8 locale available at Bertil Wennergren's home page, along with a thorough explanation of how one implements Unicode and the keyboard in Linux.
If the Linux system is recent, or kept updated, then the system is probably already working with Esperanto keys. For X11 and KDE, it's only necessary to switch to a keyboard layout that has Latin dead keys (for example, the "US International" keyboard), whenever the user wants to write in Esperanto. Some keyboards with dead keys are:
- In the US International keyboard, the dead circumflex is over the "6" key ("shift-6") and the dead breve is hidden over the "9" key ("altgr-shift-9").
- In the Brazilian ABNT2 keyboard, the dead circumflex has its own key together with dead tilde ("shift-~"), near the "Enter" key. The dead breve is hidden over the backslash ("altgr-shift-\") key.
- In the Portuguese keyboard, the dead tilde key, near the left shift key, has both the dead circumflex and the dead breve.
X11 standard mappings for Esperanto keys
Keys / Layout  US International       Brazilian ABNT2        Portuguese
ĉ              shift-6 c              shift-~ c              shift-~ c
Ĉ              shift-6 shift-c        shift-~ shift-c        shift-~ shift-c
ĝ              shift-6 g              shift-~ g              shift-~ g
Ĝ              shift-6 shift-g        shift-~ shift-g        shift-~ shift-g
ĥ              shift-6 h              shift-~ h              shift-~ h
Ĥ              shift-6 shift-h        shift-~ shift-h        shift-~ shift-h
ĵ              shift-6 j              shift-~ j              shift-~ j
Ĵ              shift-6 shift-j        shift-~ shift-j        shift-~ shift-j
ŝ              shift-6 s              shift-~ s              shift-~ s
Ŝ              shift-6 shift-s        shift-~ shift-s        shift-~ shift-s
ŭ              altgr-shift-9 u        altgr-shift-\ u        altgr-shift-~ u
Ŭ              altgr-shift-9 shift-u  altgr-shift-\ shift-u  altgr-shift-~ shift-u
Another option is to use a keyboard layout that supports the Compose key (usually mapped to the right alt or to one of the windows keys). Then, "compose-u u" will combine the character u with the breve, and "compose-shift-6 s" will combine the character s with the circumflex (assuming "shift-6" is the position of the caret).
In GNOME, there exists a separate keyboard layout for Esperanto, replacing unused characters in Esperanto with the non-ASCII characters. A separate keyboard layout for Esperanto is available in KDE, too.
If necessary, install and use high quality fonts that have Esperanto glyphs, like Microsoft Web core fonts (free for personal use) or DejaVu (The Bitstream Vera glyphs have the Bitstream Vera license and DejaVu extensions are in public domain).
There is also an applet for the gnome-panel called "Character Palette". One can add the following characters to a new palette for quick access from the panel menu bar: ĈĉĜĝĤĥĴĵŜŝŬŭ. The applet is a quick and easy way to add Esperanto characters to a web browser or text document: select the newly created palette and click a letter, and that letter is copied to the system clipboard, ready to be pasted into the document.
Mac OS X
On Mac OS X systems, Esperanto characters can be entered by activating the U.S. Extended keyboard layout in the "Input Menu" pane of the "International" system preferences.
When the U.S. Extended keyboard layout is active, Esperanto characters can be entered using multiple keystrokes using a simple mnemonic device: the 6 key contains the caret character, which looks like a circumflex, so option-6 places a caret over the following character; similarly, Option-b stands for breve, so option-b adds the breve mark over the next character.
The full reference is as follows:
Char  Name          Keystrokes
Ĉ     C-circumflex  option-6 shift-c
ĉ     c-circumflex  option-6 c
Ĝ     G-circumflex  option-6 shift-g
ĝ     g-circumflex  option-6 g
Ĥ     H-circumflex  option-6 shift-h
ĥ     h-circumflex  option-6 h
Ĵ     J-circumflex  option-6 shift-j
ĵ     j-circumflex  option-6 j
Ŝ     S-circumflex  option-6 shift-s
ŝ     s-circumflex  option-6 s
Ŭ     U-breve       option-b shift-u
ŭ     u-breve       option-b u
Swedish Esperantists using Mac OS X can use the Finnish Extended layout, which comes with the OS. Finnish has the same alphabet and type layout as Swedish; the Finnish Extended layout adds functionality just like U.S. Extended, only using other key combinations (the breve appears when you type option+y and the circumflex when you type a circumflex).
Similarly, British users may use the Irish Extended layout, which differs from the U.S. Extended keyboard layout in several ways (preserving the simple option+vowel method of applying acute accents, important for the Irish language, and the £ sign on shift-3 like the UK layout), but uses the same "dead-keys" for modifiers as U.S. Extended for Esperanto characters.
In OS X it is also possible to create your own keyboard layouts, so it is relatively easy to have more convenient mappings, like for example one based on typing an x after the letter.
Locale
An Esperanto locale would use "." as the thousands separator and "," as the decimal separator. Time and date format among Esperantists is not as standardized as number format, but 24-hour time with colon between hour and minutes, and for dates, either yyyy-mm-dd or dd-mm-yyyy, would be international and unambiguous.
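Since no Esperanto locale can be assumed to be installed on a given system, the conventions above can be applied by hand. The following Python sketch uses hypothetical helper names; the separators are parameters, so the non-breaking-space convention mentioned under Punctuation can be selected with `thousands="\u00a0"`.

```python
from datetime import datetime

def format_number(value, decimals=1, thousands=".", decimal=","):
    """Format a number with the given thousands and decimal separators."""
    text = f"{value:,.{decimals}f}"  # e.g. '12,345,678.9'
    # Swap the separators through a placeholder so they do not collide.
    return text.replace(",", "\0").replace(".", decimal).replace("\0", thousands)

def format_timestamp(moment):
    """Format a datetime as yyyy-mm-dd with 24-hour time."""
    return moment.strftime("%Y-%m-%d %H:%M")

print(format_number(12345678.9))                      # 12.345.678,9
print(format_number(12345678.9, thousands="\u00a0"))  # 12 345 678,9 (non-breaking spaces)
print(format_timestamp(datetime(2010, 12, 15, 14, 30)))  # 2010-12-15 14:30
```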
Other alphabets
While Esperanto officially uses Latin script, alphabets based on other scripts may be used instead. None of these are in widespread use.
- The Cyrillic script was formerly used for Esperanto before the means to typeset Latin was widespread in Russia.
- The Tengwar, invented by J. R. R. Tolkien, have been modified for use with Esperanto.[4]
- The Shavian alphabet, which was designed for English, has been modified for use with Esperanto.
See also
- Orthography
- Ĉ, Ĝ, Ĥ, Ĵ, Ŝ, Ŭ
References
External links
- Computer input
- Online Esperanto keyboard
- Esperanto QWERTY keyboard for Windows using spare keys
- Esperanto QWERTY keyboard for Mac and Windows using alt/opt keys
- Esperanto GKOS keyboard for Android phones/tablets with genuine support (language option in Tools menu)
- Esperanto Dvorak keyboard for Windows using spare keys
- EK – a program for Windows 9x/ME/NT/2000/XP that allows Esperanto letters to be typed in various ways
- Unired – Unicode plain text editor for Windows 95/98/NT/2000 (with E-o support)
- eoconv – a tool to convert text between various Esperanto orthographies and character encodings
Wikimedia Foundation. 2010.