Tamil script

Infobox Writing system
name = Tamil
type = Abugida
time = c. 700–present
languages = Tamil, Sanskrit
fam1 = Proto-Canaanite alphabet
fam2 = Phoenician alphabet
fam3 = Aramaic alphabet
fam4 = Brāhmī
fam5 = Grantha
sisters = Malayalam, Sinhala
children = Saurashtra
unicode = [http://www.unicode.org/charts/PDF/U0B80.pdf U+0B80–U+0BFF]
iso15924 = Taml

The Tamil script (தமிழ் அரிச்சுவடி "tamiẓ ariccuvaḍi", "Tamil alphabet", or வட்டெழுத்து "vaṭṭeẓuttu", "rounded writing") is a Vatteluttu script that is used to write the Tamil language. With the use of special diacritics to represent aspirated and voiced consonants not represented in the basic script, it is also used to write Saurashtra and, by Tamils, to write Sanskrit.

Overview

Characteristics

The Tamil script has twelve vowels (உயிரெழுத்து "uyirezhuthu", "soul-letters"), eighteen consonants (மெய்யெழுத்து "meyyeẓuttu", "body-letters") and one character, the "āytam" ஃ (ஆய்தம்), which is classified in Tamil grammar as being neither a consonant nor a vowel (அலியெழுத்து "aliyeẓuttu", "the hermaphrodite letter"), though it is often listed with the vowel set (உயிரெழுத்துக்கள் "uyirezhuthukkaḷ", "vowel class") [http://dsal.uchicago.edu/cgi-bin/romadict.pl?table=tamil-lex&page=148&display=utf8 University of Madras Tamil Lexicon, page 148: «அலியெழுத்து [ aliyeẕuttu ] n ali-y-eḻuttu . < அலி¹ +. 1. The letter ஃ , as being regarded neither a vowel nor a consonant; ஆய்தம். (வெண்பாப். முதன்மொ. 6, உரை.) 2. Consonants; மெய்யெழுத்து. (பிங்.).»]. The script is, however, syllabic and not alphabetic.

The complete script therefore consists of the thirty-one letters in their independent form and an additional 216 combinant letters (உயிர்மெய்யெழுத்து "uyirmeyyeẓuttu"), for a total of 247 letters, each of which is a consonant combined with a vowel, a mute consonant, a vowel alone, or the "āytam". The combinant letters are formed by adding a vowel marker to the consonant. Some vowels require the basic shape of the consonant to be altered in a way that is specific to that vowel. Others are written by adding a vowel-specific suffix to the consonant, yet others a prefix, and finally some vowels require adding both a prefix and a suffix to the consonant. In every case the vowel marker is different from the standalone character for the vowel.
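The letter counts above can be checked with a short Python sketch. It builds the inventory from Unicode code points (the same ones listed in the Unicode section of this article); the variable names are illustrative only, and the sketch assumes that a consonant with no vowel sign carries the inherent "a".

```python
# Independent vowels, in traditional order (a, aa, i, ii, u, uu, e, ee, ai, o, oo, au).
VOWELS = [0x0B85, 0x0B86, 0x0B87, 0x0B88, 0x0B89, 0x0B8A,
          0x0B8E, 0x0B8F, 0x0B90, 0x0B92, 0x0B93, 0x0B94]

# The eighteen basic consonants, in traditional order.
CONSONANTS = [0x0B95, 0x0B99, 0x0B9A, 0x0B9E, 0x0B9F, 0x0BA3,
              0x0BA4, 0x0BA8, 0x0BAA, 0x0BAE, 0x0BAF, 0x0BB0,
              0x0BB2, 0x0BB5, 0x0BB4, 0x0BB3, 0x0BB1, 0x0BA9]

# Dependent vowel signs matching VOWELS; the empty string stands for
# the inherent "a", which has no written sign.
VOWEL_SIGNS = ["", "\u0BBE", "\u0BBF", "\u0BC0", "\u0BC1", "\u0BC2",
               "\u0BC6", "\u0BC7", "\u0BC8", "\u0BCA", "\u0BCB", "\u0BCC"]

PULLI = "\u0BCD"   # dot that suppresses the inherent vowel
AYTAM = "\u0B83"   # the aytam

vowels = [chr(v) for v in VOWELS]
pure_consonants = [chr(c) + PULLI for c in CONSONANTS]
combinant = [chr(c) + sign for c in CONSONANTS for sign in VOWEL_SIGNS]

independent = vowels + pure_consonants + [AYTAM]
all_letters = independent + combinant

print(len(independent), len(combinant), len(all_letters))
```

The three printed numbers correspond to the thirty-one independent letters, the 216 combinant letters and the total of 247.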

The Tamil script is written from left to right.

History

Tamil script is thought to have evolved from the Brahmi script, itself generally believed to derive from the Aramaic script of the Middle East. A small minority of scholars believe that Brahmi may have derived directly from the Indus script. Inscriptions from the second century AD use a later form of the Tamil Brahmi script, which is substantially similar to the writing system described in the "Tolkappiyam", an ancient Tamil grammar. Most notably, they use the "puḷḷi" to suppress the inherent vowel. The Tamil letters thereafter evolved towards a more rounded form, and by the fifth or sixth century AD had reached a form called the "early vaṭṭeḻuttu", the immediate ancestor of the "vaṭṭeḻuttu" ("rounded writing") script in use today. The rounded shape of the letters is partly a result of the fact that in ancient times writing involved using a sharp-pointed stylus to carve the letters on palm leaves ("olaiccuvaṭi"), a process which made it easier to produce curves than straight lines. Some scholars state that the script was originally called "veṭṭeḻuthu", meaning "script that was cut (on stone)", a reference to the ease of carving it in stone.

In addition to producing rounder letters, the use of palm leaves as the primary medium for writing led to other changes in the Tamil script. The scribe had to be careful not to pierce the leaves with the stylus while writing, because a leaf with a hole was likelier to tear and decayed faster. As a result, the use of the "puḷḷi" to distinguish pure consonants became rare, with pure consonants usually being written as if the inherent vowel were present. Similarly, the vowel marker for the "kuṟṟiyal ukaram", a half-rounded "u" which occurs at the end of some words and in the medial position in certain compound words, also fell out of use and was replaced by the marker for the simple "u". The "puḷḷi" did not fully reappear until the introduction of printing, but the marker for the "kuṟṟiyal ukaram" never came back into use, although the sound itself still exists and plays an important role in Tamil prosody.

The forms of some of the letters were simplified in the nineteenth century to make the script easier to typeset. In the twentieth century, the script was simplified even further in a series of reforms, which regularised the vowel markers used with consonants by eliminating special markers and most irregular forms.

Relationship with other Indic scripts

The Tamil script differs from other Brahmi-derived scripts in a number of ways. Unlike every other Indic script, it uses the same character to represent both an unvoiced stop and its voiced equivalent. Thus the character க் "k", for example, represents both [k] and [g]. This is because Tamil grammar treats only unvoiced stops as being "true" consonants, treating voiced and aspirated sounds as euphonic variants of unvoiced sounds. Traditional Tamil grammars contain detailed rules, observed in formal speech, for when a stop is to be pronounced with and without voice. These rules are not followed in colloquial or dialectal speech, where voiced and unvoiced versions of a stop are, in effect, allophones, being used in specific phonetic contexts without serving to distinguish words.

Also unlike other Indic scripts, the Tamil script hardly uses special consonantal ligatures to represent conjunct consonants, which are far less frequent in Tamil than in other Indian languages. Conjunct consonants, where they occur, are written by writing the character for the first consonant, adding the "puḷḷi" to suppress its inherent vowel, and then writing the character for the second consonant. There are a few exceptions, namely க்ஷ "kṣa" and ஸ்ரீ "srī".

Tamil letters

Basic consonants

Consonants are called the 'body' ("mei") letters. The consonants are classified into three categories: "vallinam" (hard consonants), "mellinam" (soft consonants, including all nasals), and "idayinam" (medium consonants).

There are some lexical rules for the formation of words, which the "Tolkāppiyam" describes. Some examples: a word cannot end in certain consonants, and cannot begin with certain consonants, including 'r', 'l' and 'll'. There are two consonants for the dental 'n'; which one should be used depends on whether the 'n' occurs at the start of the word and on the letters around it.

The order of the alphabet (strictly abugida) in Tamil closely matches that of the linguistically unrelated Indo-Aryan languages, reflecting the common origin of their scripts from Brahmi.

Vowels

Vowels are also called the 'life' ("uyir") or 'soul' letters. Together with the consonants (which are called 'body' letters), they form compound, syllabic (abugida) letters that are called 'living' letters ("uyirmei", i.e. letters that have both 'body' and 'soul').

Tamil vowels are divided into short and long (five of each type) and two diphthongs.

Isolated form

The special letter ஃ (pronounced 'akh') is rarely used by itself. It normally serves a purely grammatical function as the independent vowel form of the dot on consonants that suppresses the inherent 'a' sound in plain consonants. However, in modern times it has come to be used to represent foreign sounds: for example, ஃ + ப is used to represent the English sound 'F', which is not found in Tamil.

The long ("nedil") vowels are about twice as long as the short ("kuRil") vowels. The diphthongs are usually pronounced about one and a half times as long as the short vowels, though some grammatical texts place them with the long ("nedil") vowels.

As can be seen in the compound form, the vowel sign can be added to the right, left or both sides of the consonants. It can also form a ligature. These rules are evolving and older use has more ligatures than modern use. What you actually see on this page depends on your font selection; for example, Code2000 will show more ligatures than Latha.

There are proponents of script reform who want to eliminate all ligatures and let all vowel signs appear on the right side.

Unicode encodes the character in logical order (always the consonant first), whereas legacy 8-bit encodings (such as TSCII) prefer the written order. This makes it necessary to reorder when converting from one encoding to another; it is not sufficient simply to map one set of codepoints to the other.
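The reordering step can be sketched in Python as follows. This is only an illustration of moving from logical order to written order: it works on Unicode code points throughout and does not map anything to actual TSCII byte values. The function name `logical_to_visual` is hypothetical, and the sketch assumes each vowel sign directly follows a single consonant (it does not handle the ligatures க்ஷ and ஸ்ரீ).

```python
# Vowel signs that are drawn to the LEFT of the consonant (e, ee, ai).
PREFIX_SIGNS = {"\u0BC6", "\u0BC7", "\u0BC8"}

# Two-part vowel signs (o, oo, au): a left part and a right part.
SPLIT_SIGNS = {
    "\u0BCA": ("\u0BC6", "\u0BBE"),
    "\u0BCB": ("\u0BC7", "\u0BBE"),
    "\u0BCC": ("\u0BC6", "\u0BD7"),
}


def logical_to_visual(text: str) -> str:
    """Return the code points of `text` in the order they are written."""
    out = []
    for ch in text:
        if ch in SPLIT_SIGNS and out:
            left, right = SPLIT_SIGNS[ch]
            out.insert(len(out) - 1, left)   # left part goes before the consonant
            out.append(right)                # right part stays after it
        elif ch in PREFIX_SIGNS and out:
            out.insert(len(out) - 1, ch)     # whole sign goes before the consonant
        else:
            out.append(ch)                   # everything else keeps its position
    return "".join(out)


# Show the code points of "ko" (U+0B95 U+0BCA) before and after reordering.
for label, s in (("logical", "\u0B95\u0BCA"),
                 ("visual", logical_to_visual("\u0B95\u0BCA"))):
    print(label, [f"{ord(c):04X}" for c in s])
```

A converter to a written-order encoding would apply a step like this first and only then map each resulting unit to the target code page.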

Compound table of Tamil letters

The following table lists vowel ("uyir" or life) letters across the top and consonant ("mei" or body) letters along the side, the combination of which gives all Tamil compound (uyirmei) letters.

Tamil in Unicode

The Unicode range for Tamil is U+0B80–U+0BFF. Grey areas indicate non-assigned code points. Most of the non-assigned codepoints are designated reserved because they are in the same relative position as characters assigned in other South Asian script blocks that correspond to phonemes that don't exist in the Tamil script.

Like other South Asian scripts in Unicode, the Tamil encoding was originally derived from the ISCII standard. Both ISCII and Unicode encode Tamil as an abugida. Each codepoint representing a similar phoneme is encoded in the same relative position in each South Asian script block in Unicode. Although Unicode represents Tamil as an abugida, all the pure consonants and syllables in Tamil can be represented by combining multiple Unicode codepoints, as can be seen in the Unicode Tamil Syllabary below.
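The shared relative positions can be illustrated with a small Python sketch. The block start values are those of the Unicode Devanagari, Tamil, Telugu and Malayalam blocks; the helper name `offset_in_block` is made up for this example.

```python
import unicodedata

# First code point of a few 128-code-point South Asian script blocks.
BLOCK_STARTS = {
    "Devanagari": 0x0900,
    "Tamil": 0x0B80,
    "Telugu": 0x0C00,
    "Malayalam": 0x0D00,
}


def offset_in_block(ch: str, block: str) -> int:
    """Position of `ch` relative to the start of its script block."""
    return ord(ch) - BLOCK_STARTS[block]


# The letter KA sits at the same offset in every block.
ka = {"Devanagari": "\u0915", "Tamil": "\u0B95",
      "Telugu": "\u0C15", "Malayalam": "\u0D15"}
for block, ch in ka.items():
    print(block, unicodedata.name(ch), hex(offset_in_block(ch, block)))

# Devanagari KHA (aspirated) has no Tamil counterpart, so the matching
# Tamil position is left unassigned and has no character name.
tamil_slot = chr(BLOCK_STARTS["Tamil"] + offset_in_block("\u0916", "Devanagari"))
print(unicodedata.name("\u0916"), "->", unicodedata.name(tamil_slot, None))
```

The second part shows why the Tamil block has gaps: positions used by aspirated or voiced letters in other blocks are kept in reserve rather than reused.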

In Unicode 5.1, named sequences were added for all Tamil pure consonants and syllables.
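As a brief illustration, Python's standard `unicodedata.lookup` function resolves named sequences (documented as supported since Python 3.3), so a Tamil syllable can be looked up by name and comes back as a string of more than one code point. The two names used here are taken from the Unicode named-sequence list; treat the snippet as a sketch rather than a full survey of the names.

```python
import unicodedata

# Each named sequence resolves to a multi-code-point string.
for name in ("TAMIL SYLLABLE MOO", "TAMIL SYLLABLE NNOO"):
    seq = unicodedata.lookup(name)
    print(name, seq, [f"{ord(c):04X}" for c in seq])
```

Note that the reverse direction is not available through `unicodedata.name`, which only accepts a single character.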

Unicode Tamil Syllabary

Each cell gives the letter followed by its Unicode code point sequence.

Consonants ↓ / Vowels → | அ 0B85 | ஆ 0B86 | இ 0B87 | ஈ 0B88 | உ 0B89 | ஊ 0B8A | எ 0B8E | ஏ 0B8F | ஐ 0B90 | ஒ 0B92 | ஓ 0B93 | ஔ 0B94
க் 0B95 0BCD | க 0B95 | கா 0B95 0BBE | கி 0B95 0BBF | கீ 0B95 0BC0 | கு 0B95 0BC1 | கூ 0B95 0BC2 | கெ 0B95 0BC6 | கே 0B95 0BC7 | கை 0B95 0BC8 | கொ 0B95 0BCA | கோ 0B95 0BCB | கௌ 0B95 0BCC
ங் 0B99 0BCD | ங 0B99 | ஙா 0B99 0BBE | ஙி 0B99 0BBF | ஙீ 0B99 0BC0 | ஙு 0B99 0BC1 | ஙூ 0B99 0BC2 | ஙெ 0B99 0BC6 | ஙே 0B99 0BC7 | ஙை 0B99 0BC8 | ஙொ 0B99 0BCA | ஙோ 0B99 0BCB | ஙௌ 0B99 0BCC
ச் 0B9A 0BCD | ச 0B9A | சா 0B9A 0BBE | சி 0B9A 0BBF | சீ 0B9A 0BC0 | சு 0B9A 0BC1 | சூ 0B9A 0BC2 | செ 0B9A 0BC6 | சே 0B9A 0BC7 | சை 0B9A 0BC8 | சொ 0B9A 0BCA | சோ 0B9A 0BCB | சௌ 0B9A 0BCC
ஞ் 0B9E 0BCD | ஞ 0B9E | ஞா 0B9E 0BBE | ஞி 0B9E 0BBF | ஞீ 0B9E 0BC0 | ஞு 0B9E 0BC1 | ஞூ 0B9E 0BC2 | ஞெ 0B9E 0BC6 | ஞே 0B9E 0BC7 | ஞை 0B9E 0BC8 | ஞொ 0B9E 0BCA | ஞோ 0B9E 0BCB | ஞௌ 0B9E 0BCC
ட் 0B9F 0BCD | ட 0B9F | டா 0B9F 0BBE | டி 0B9F 0BBF | டீ 0B9F 0BC0 | டு 0B9F 0BC1 | டூ 0B9F 0BC2 | டெ 0B9F 0BC6 | டே 0B9F 0BC7 | டை 0B9F 0BC8 | டொ 0B9F 0BCA | டோ 0B9F 0BCB | டௌ 0B9F 0BCC
ண் 0BA3 0BCD | ண 0BA3 | ணா 0BA3 0BBE | ணி 0BA3 0BBF | ணீ 0BA3 0BC0 | ணு 0BA3 0BC1 | ணூ 0BA3 0BC2 | ணெ 0BA3 0BC6 | ணே 0BA3 0BC7 | ணை 0BA3 0BC8 | ணொ 0BA3 0BCA | ணோ 0BA3 0BCB | ணௌ 0BA3 0BCC
த் 0BA4 0BCD | த 0BA4 | தா 0BA4 0BBE | தி 0BA4 0BBF | தீ 0BA4 0BC0 | து 0BA4 0BC1 | தூ 0BA4 0BC2 | தெ 0BA4 0BC6 | தே 0BA4 0BC7 | தை 0BA4 0BC8 | தொ 0BA4 0BCA | தோ 0BA4 0BCB | தௌ 0BA4 0BCC
ந் 0BA8 0BCD | ந 0BA8 | நா 0BA8 0BBE | நி 0BA8 0BBF | நீ 0BA8 0BC0 | நு 0BA8 0BC1 | நூ 0BA8 0BC2 | நெ 0BA8 0BC6 | நே 0BA8 0BC7 | நை 0BA8 0BC8 | நொ 0BA8 0BCA | நோ 0BA8 0BCB | நௌ 0BA8 0BCC
ப் 0BAA 0BCD | ப 0BAA | பா 0BAA 0BBE | பி 0BAA 0BBF | பீ 0BAA 0BC0 | பு 0BAA 0BC1 | பூ 0BAA 0BC2 | பெ 0BAA 0BC6 | பே 0BAA 0BC7 | பை 0BAA 0BC8 | பொ 0BAA 0BCA | போ 0BAA 0BCB | பௌ 0BAA 0BCC
ம் 0BAE 0BCD | ம 0BAE | மா 0BAE 0BBE | மி 0BAE 0BBF | மீ 0BAE 0BC0 | மு 0BAE 0BC1 | மூ 0BAE 0BC2 | மெ 0BAE 0BC6 | மே 0BAE 0BC7 | மை 0BAE 0BC8 | மொ 0BAE 0BCA | மோ 0BAE 0BCB | மௌ 0BAE 0BCC
ய் 0BAF 0BCD | ய 0BAF | யா 0BAF 0BBE | யி 0BAF 0BBF | யீ 0BAF 0BC0 | யு 0BAF 0BC1 | யூ 0BAF 0BC2 | யெ 0BAF 0BC6 | யே 0BAF 0BC7 | யை 0BAF 0BC8 | யொ 0BAF 0BCA | யோ 0BAF 0BCB | யௌ 0BAF 0BCC
ர் 0BB0 0BCD | ர 0BB0 | ரா 0BB0 0BBE | ரி 0BB0 0BBF | ரீ 0BB0 0BC0 | ரு 0BB0 0BC1 | ரூ 0BB0 0BC2 | ரெ 0BB0 0BC6 | ரே 0BB0 0BC7 | ரை 0BB0 0BC8 | ரொ 0BB0 0BCA | ரோ 0BB0 0BCB | ரௌ 0BB0 0BCC
ல் 0BB2 0BCD | ல 0BB2 | லா 0BB2 0BBE | லி 0BB2 0BBF | லீ 0BB2 0BC0 | லு 0BB2 0BC1 | லூ 0BB2 0BC2 | லெ 0BB2 0BC6 | லே 0BB2 0BC7 | லை 0BB2 0BC8 | லொ 0BB2 0BCA | லோ 0BB2 0BCB | லௌ 0BB2 0BCC
வ் 0BB5 0BCD | வ 0BB5 | வா 0BB5 0BBE | வி 0BB5 0BBF | வீ 0BB5 0BC0 | வு 0BB5 0BC1 | வூ 0BB5 0BC2 | வெ 0BB5 0BC6 | வே 0BB5 0BC7 | வை 0BB5 0BC8 | வொ 0BB5 0BCA | வோ 0BB5 0BCB | வௌ 0BB5 0BCC
ழ் 0BB4 0BCD | ழ 0BB4 | ழா 0BB4 0BBE | ழி 0BB4 0BBF | ழீ 0BB4 0BC0 | ழு 0BB4 0BC1 | ழூ 0BB4 0BC2 | ழெ 0BB4 0BC6 | ழே 0BB4 0BC7 | ழை 0BB4 0BC8 | ழொ 0BB4 0BCA | ழோ 0BB4 0BCB | ழௌ 0BB4 0BCC
ள் 0BB3 0BCD | ள 0BB3 | ளா 0BB3 0BBE | ளி 0BB3 0BBF | ளீ 0BB3 0BC0 | ளு 0BB3 0BC1 | ளூ 0BB3 0BC2 | ளெ 0BB3 0BC6 | ளே 0BB3 0BC7 | ளை 0BB3 0BC8 | ளொ 0BB3 0BCA | ளோ 0BB3 0BCB | ளௌ 0BB3 0BCC
ற் 0BB1 0BCD | ற 0BB1 | றா 0BB1 0BBE | றி 0BB1 0BBF | றீ 0BB1 0BC0 | று 0BB1 0BC1 | றூ 0BB1 0BC2 | றெ 0BB1 0BC6 | றே 0BB1 0BC7 | றை 0BB1 0BC8 | றொ 0BB1 0BCA | றோ 0BB1 0BCB | றௌ 0BB1 0BCC
ன் 0BA9 0BCD | ன 0BA9 | னா 0BA9 0BBE | னி 0BA9 0BBF | னீ 0BA9 0BC0 | னு 0BA9 0BC1 | னூ 0BA9 0BC2 | னெ 0BA9 0BC6 | னே 0BA9 0BC7 | னை 0BA9 0BC8 | னொ 0BA9 0BCA | னோ 0BA9 0BCB | னௌ 0BA9 0BCC

See also

*Brahmic family
* [http://ta.wikibooks.org/wiki/%E0%AE%A4%E0%AE%AE%E0%AE%BF%E0%AE%B4%E0%AF%8D_%E0%AE%8E%E0%AE%B4%E0%AF%81%E0%AE%A4%E0%AF%8D%E0%AE%A4%E0%AF%81%E0%AE%95%E0%AE%B3%E0%AF%8D தமிழ் எழுத்துகள் (Tamil letters)]

Notes

References

* Steever, Sanford B. (1996) "Tamil Writing" in Peter T. Daniels and William Bright (eds.) "The World's Writing Systems". New York: Oxford University Press. ISBN 0-19-507993-0

External links

* [http://www.unc.edu/~echeran/paadanool/pdf/introduction.pdf Tamil Alphabet & Basics] (PDF)
* [http://ccat.sas.upenn.edu/plc/tamilweb/book/chapter1/node5.html#SECTION00120000000000000000 Phonetics of spoken Tamil]
* [http://www.fileformat.info/info/unicode/char/0b85/index.htm Unicode Character]
* [http://www.unicode.org/charts/PDF/U0B80.pdf Unicode Chart] - For Tamil (PDF)
* [http://www.microsoft.com/globaldev/nlsweb/default.asp?submitted=449 NLS Information] - NLS information page for Windows XP
* [http://www.iit.edu/~laksvij/language/tamil.html Transliterator] - A means to transliterate romanized text to Unicode Tamil.
* [http://unicode.org/faq/indic.html Unicode Consortium Indic Scripts FAQ]
* [http://www.unicode.org/versions/Unicode5.0.0/ch09.pdf Unicode Standard for South Asian scripts]
* [http://www.unicode.org/versions/Unicode5.1.0/#Tamil_Named_Character_Sequences Tamil Named Character Sequences in Unicode 5.1]


Wikimedia Foundation. 2010.
