Compound (linguistics)
In linguistics, a compound is a lexeme (less precisely, a word) that consists of more than one stem. Compounding, or composition, is the word-formation process that creates compound lexemes; the other major word-formation process is derivation. Put simply, compounding occurs when two or more words are joined to form a single word. The meanings of the parts interrelate so that the compound acquires a new meaning, which may differ considerably from the meanings of the parts in isolation.
Formation of compounds
Compound formation rules vary widely across language types.
In a synthetic language, the relationship between the elements of a compound may be marked with a case or other morpheme. For example, the German compound Kapitänspatent consists of the lexemes Kapitän (sea captain) and Patent (license) joined by an -s- (originally a genitive case suffix); similarly, the Latin lexeme paterfamilias contains the archaic genitive form familias of the lexeme familia (family). Conversely, in the Hebrew compound בֵּית סֵפֶר bet sefer (school), it is the head that is modified: the compound literally means "house-of book", with בַּיִת bayit (house) having entered the construct state to become בֵּית bet (house-of). This latter pattern is common throughout the Semitic languages, though in some it is combined with an explicit genitive case, so that both parts of the compound are marked.
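The role of a linking element such as the German -s- can be illustrated with a short sketch. The function name `join_stems` and its parameters are hypothetical, chosen only for this illustration; the sketch simply concatenates stems and optional linking elements and does not model which linker a given language actually requires.

```python
def join_stems(stems, linkers=None):
    """Concatenate stems into one compound, inserting a linking element
    between each pair of neighbouring stems (hypothetical helper)."""
    if not stems:
        raise ValueError("at least one stem is required")
    if linkers is None:
        # Default: no linking element between any pair of stems.
        linkers = [""] * (len(stems) - 1)
    if len(linkers) != len(stems) - 1:
        raise ValueError("need exactly one linker per pair of neighbouring stems")

    word = stems[0]
    for linker, stem in zip(linkers, stems[1:]):
        # Non-initial stems are lowercased so the result is spelled as one word.
        word += linker + stem.lower()
    return word


# German: Kapitän + -s- + Patent
print(join_stems(["Kapitän", "Patent"], ["s"]))
# Latin: pater + familias (the genitive is marked on the second element itself)
print(join_stems(["pater", "familias"]))
```

The linker is passed in explicitly because, as the examples above show, the marking differs from language to language and cannot be derived from the stems alone.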
Agglutinative languages tend to create very long words with derivational morphemes. Compounds may or may not also require derivational morphemes. Some of the longest compounds are found in Finnish and in the Germanic languages. In German, freely extendable compound words occur in the nomenclature of chemical compounds, where, in biochemistry and polymer chemistry, they can be practically unlimited in length. German examples include Farbfernsehgerät (color television set), Funkfernbedienung (radio remote control), and the jocular word Donaudampfschifffahrtsgesellschaftskapitänsmütze (Danube steamboat shipping company captain's hat).
In Finnish there is no theoretical limit to the length of compound words, but in practice words consisting of more than three components are rare. Even those can look mysterious to non-Finnish speakers; take hätäuloskäytävä (emergency exit) as an example. Internet folklore sometimes suggests that lentokonesuihkuturbiinimoottoriapumekaanikkoaliupseerioppilas (airplane jet turbine engine auxiliary mechanic non-commissioned officer student) is the longest word in Finnish, but evidence of its actual use is scant and anecdotal at best.
Compounds can become rather long when technical documents are translated from English into another language, for example Swedish. "Motion estimation search range settings" can be directly translated to rörelseuppskattningssökintervallsinställningar; the length of such words is theoretically unlimited, especially in chemical terminology.
Subclasses
A common semantic classification of compounds yields four types:
- endocentric
- exocentric (also bahuvrihi)
- copulative (also dvandva)
- appositional
An endocentric compound consists of a head, i.e. the categorical part that contains the basic meaning of the whole compound, and modifiers, which restrict this meaning. For example, the English compound doghouse, where house is the head and dog is the modifier, is understood as a house intended for a dog. Endocentric compounds tend to be of the same part of speech (word class) as their head, as in the case of doghouse. (Such compounds were called tatpuruṣa in the Sanskrit tradition.)
Exocentric compounds (called bahuvrihi compounds in the Sanskrit tradition) are hyponyms of some unexpressed semantic head (e.g. a person, a plant, or an animal), and their meaning often cannot be transparently guessed from their constituent parts. For example, the English compound white-collar denotes neither a kind of collar nor a white thing; the collar's colour is a metaphor for socioeconomic status. In an exocentric compound, the word class is determined lexically, disregarding the class of the constituents: a must-have, for example, is not a verb but a noun. The meaning of this type of compound can be glossed as "(one) whose B is A", where B is the second element of the compound and A the first. Other English examples include barefoot and Blackbeard.
Copulative compounds are compounds which have two semantic heads.
Appositional compounds refer to lexemes that have two (contrary) attributes which classify the compound.
The four types can be summarized as follows (type: description; examples):
- endocentric: A+B denotes a special kind of B; darkroom, smalltalk
- exocentric: A+B denotes a special kind of an unexpressed semantic head; skinhead, paleface (head: 'person')
- copulative: A+B denotes 'the sum' of what A and B denote; bittersweet, sleepwalk
- appositional: A and B provide different descriptions for the same referent; actor-director, maidservant
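The four-way classification can also be expressed as a small data structure. The following is only a sketch: the class `Compound`, the function `gloss`, and the field names are hypothetical, and the generated paraphrases merely restate the descriptions given above rather than analysing a word automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    first: str              # element A
    second: str             # element B
    kind: str               # "endocentric", "exocentric", "copulative" or "appositional"
    implied_head: str = ""  # unexpressed head, relevant only for exocentric compounds


def gloss(c: Compound) -> str:
    """Return a rough paraphrase of the compound according to its semantic type."""
    if c.kind == "endocentric":
        return f"a special kind of {c.second}"
    if c.kind == "exocentric":
        return f"a special kind of {c.implied_head} (head not expressed)"
    if c.kind == "copulative":
        return f"the sum of {c.first} and {c.second}"
    if c.kind == "appositional":
        return f"both {c.first} and {c.second}"
    raise ValueError(f"unknown compound type: {c.kind}")


print(gloss(Compound("dark", "room", "endocentric")))
print(gloss(Compound("skin", "head", "exocentric", implied_head="person")))
print(gloss(Compound("bitter", "sweet", "copulative")))
print(gloss(Compound("actor", "director", "appositional")))
```

Note that the type has to be supplied by hand: as the exocentric examples show, it is a lexical property of the compound and cannot be read off its parts.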
Most natural languages have compound nouns. The positioning of the words (i.e. the most common order of constituents in phrases where nouns are modified by adjectives, by possessors, by other nouns, etc.) varies according to the language. While the Germanic languages, for example, are left-branching in noun phrases (the modifiers come before the head), the Romance languages are usually right-branching.
In French, compound nouns are often formed by left-hand heads with prepositional components inserted before the modifier, as in chemin-de-fer 'railway' lit. 'road of iron' and moulin à vent 'windmill', lit. 'mill (that works)-by-means-of wind'.
In Turkish, one way of forming compound nouns is as follows: yeldeğirmeni 'windmill' (yel: wind, değirmen-i: mill-possessive); demiryolu 'railway' (demir: iron, yol-u: road-possessive).
A type of compound that is fairly common in the Indo-European languages is formed of a verb and its object, and in effect transforms a simple verbal clause into a noun.
In Spanish, for example, such compounds consist of a verb conjugated for third person singular, present tense, indicative mood followed by a noun (usually plural): e.g., rascacielos (modelled on "skyscraper", lit. 'scratches skies'), sacacorchos ('corkscrew', lit. 'removes corks'), guardarropas ('wardrobe', lit. 'stores clothing'). These compounds are formally invariable in the plural (but in many cases they have been reanalyzed as plural forms, and a singular form has appeared). French and Italian have these same compounds with the noun in the singular form: Italian grattacielo, 'skyscraper'; French grille-pain, 'toaster' (lit. 'toasts bread') and torche-cul 'ass-wipe' (Rabelais: See his "propos torcheculatifs").
This construction exists in English, generally with the verb and noun both in uninflected form: examples are spoilsport, killjoy, breakfast, cutthroat, pickpocket, dreadnought, and know-nothing.
Also common in English is another type of verb–noun (or noun–verb) compound, in which an argument of the verb is incorporated into the verb, which is then usually turned into a gerund, such as breastfeeding, finger-pointing, etc. The noun is often an instrumental complement. From these gerunds new verbs can be made: (a mother) breastfeeds (a child) and from them new compounds mother-child breastfeeding, etc.
In the Australian Aboriginal language Jingulu (a Mirndi language, outside the Pama–Nyungan family), it is claimed that all verbs are V+N compounds, such as "do a sleep" or "run a dive", and that the language has only three basic verbs: do, make, and run.
A special kind of composition is incorporation, of which noun incorporation into a verbal root (as in English backstabbing, breastfeed, etc.) is the most prevalent type.
Verb–verb compounds are sequences of more than one verb acting together to determine clause structure. There are two types:
- In a serial verb, two actions, often sequential, are expressed in a single clause. For example, Ewe trɔ dzo, lit. "turn leave", means "turn and leave", and Hindi जाकर देखो jā-kar dekh-o, lit. "go-CONJUNCTIVE PARTICIPLE see-IMPERATIVE", means "go and see". In each case, the two verbs together determine the semantics and argument structure.
Serial verb expressions in English may include What did you go and do that for? or He just upped and left; these are, however, not quite true compounds, since the verbs are connected by a conjunction and the missing arguments of the second verb may be treated as a case of ellipsis.
- In a compound verb (or complex predicate), one of the verbs is the primary verb and determines the primary semantics and the argument structure. The secondary verb, often called a vector verb or explicator, provides fine distinctions, usually in temporality or aspect, and also carries the inflection (tense and/or agreement markers). The main verb usually appears in conjunctive participial (sometimes zero) form. For example, Hindi निकल गया nikal gayā, lit. "exit went", means 'went out', while निकल पड़ा nikal paRā, lit. "exit fell", means 'departed' or 'was blurted out'. In these examples निकल nikal is the primary verb, and गया gayā and पड़ा paRā are the vector verbs. Similarly, in both English start reading and Japanese 読み始める yomihajimeru, lit. "read-CONJUNCTIVE-start", 'start reading', the vector verbs start and 始める hajimeru "start" change according to tense, negation, and the like, while the main verbs reading and 読み yomi "reading" usually remain the same. An exception is the passive voice, in which both English and Japanese modify the main verb: start to be read and 読まれ始める yomarehajimeru, lit. "read-PASSIVE-(CONJUNCTIVE)-start", 'start to be read'. With a few exceptions, all compound verbs alternate with their simple counterparts. That is, removing the vector affects neither grammaticality nor, to any great extent, the meaning: निकला nikalā '(He) went out.' In a few languages both components of the compound verb can be finite forms: Kurukh kecc-ar ker-ar, lit. "died-3pl went-3pl", '(They) died.'
- Compound verbs are very common in some languages, such as the northern Indo-Aryan languages Hindi-Urdu and Panjabi, where as many as 20% of verb forms in running text are compound. They exist but are less common in Dravidian languages and in other Indo-Aryan languages like Marathi and Nepali, in Tibeto-Burman languages like Limbu and Newari, in languages sometimes classified as Altaic, such as Turkish, Korean, Japanese, Kazakh, Uzbek, and Kyrgyz, and in northeast Caucasian languages like Tsez and Avar.
- Under the influence of a Quichua substrate, speakers living in the Ecuadorian altiplano have innovated compound verbs in Spanish:
- De rabia puso rompiendo la olla, 'In anger (he/she) smashed the pot.' (Lit. from anger put breaking the pot)
- Botaremos matándote 'We will kill you.' (Cf. Quichua huañuchi-shpa shitashun, lit. kill-CP throw.1plFut, and the parallel Hindi construction तेरे को मार डालेंगे.)
- Compound verb equivalents in English (examples from the internet):
- What did you go and do that for?
- If you are not giving away free information on your web site then a huge proportion of your business is just upping and leaving.
- Big Pig, she took and built herself a house out of brush.
- Caution: In descriptions of Persian and other Iranian languages the term 'compound verb' refers to noun-plus-verb compounds, not to the verb–verb compounds discussed here.
Compound prepositions formed by prepositions and nouns are common in English and the Romance languages (consider English on top of, Spanish encima de, etc.). Japanese shows the same pattern, except the word order is the opposite (with postpositions): no naka (lit. "of inside", i.e. "on the inside of"). Hindi has a small number of simple (i.e., one-word) postpositions and a large number of compound postpositions, mostly consisting of simple postposition ke followed by a specific postposition (e.g., ke pas, "near"; ke nīche, "underneath").
Examples from different languages
Ojibwe:
- mashkikiwaaboo 'tonic': mashkiki 'medicine' + waaboo 'liquid'
- miskomin 'raspberry': misko 'red' + miin 'berry'
- dibik-giizis 'moon': dibik 'night' + giizis 'sun'
- gichi-mookomaan 'white person/American': gichi 'big' + mookomaan 'knife'
Chinese (Cantonese Jyutping):
- 學生 'student': 學 hok6 learn + 生 sang1 grow
- 太空 'outer space': 太 taai3 great + 空 hung1 emptiness
- 摩天樓 'skyscraper': 摩 mo1 touch + 天 tin1 sky + 樓 lau2 building (with more than 1 storey)
- 打印機 'printer': 打 daa2 strike + 印 jan3 stamp/print + 機 gei1 machine
- 百科全書 'encyclopaedia': 百 baak3 100 + 科 fo1 (branch of) study + 全 cyun4 entire/complete + 書 syu1 book
Dutch:
- Arbeidsongeschiktheidsverzekering 'disability insurance': arbeid 'labour', + ongeschiktheid 'inaptitude', + verzekering 'insurance'.
- Rioolwaterzuiveringsinstallatie 'wastewater treatment plant': riool 'sewer', + water 'water', + zuivering 'cleaning', + installatie 'installation'.
- Verjaardagskalender 'birthday calendar': verjaardag 'birthday', + kalender 'calendar'.
- Klantenservicemedewerker 'customer service representative': klanten 'customers', + service 'service', + medewerker 'worker'.
- Universiteitsbibliotheek 'university library': universiteit 'university', + bibliotheek 'library'.
- Doorgroeimogelijkheden 'possibilities for advancement': door 'through', + groei 'grow', + mogelijkheden 'possibilities'.
Finnish:
- sanakirja 'dictionary': sana 'word', + kirja 'book'
- tietokone 'computer': tieto 'knowledge, data', + kone 'machine'
- keskiviikko 'Wednesday': keski 'middle', + viikko 'week'
- maailma 'world': maa 'land', + ilma 'air'
- rautatieasema 'railway station': rauta 'iron' + tie 'road' + asema 'station'
- suihkuturbiiniapumekaanikkoaliupseerioppilas 'jet engine assistant mechanic NCO student'
- atomiydinenergiareaktorigeneraattorilauhduttajaturbiiniratasvaihde: some part of a nuclear plant
German:
- Wolkenkratzer 'skyscraper': Wolken 'clouds', + Kratzer 'scraper'
- Eisenbahn 'railway': Eisen 'iron', + Bahn 'track'
- Kraftfahrzeug 'automobile': Kraft 'power', + fahren/fahr 'drive', + Zeug 'machinery'
- Stacheldraht 'barbed wire': Stachel 'barb/barbed', + Draht 'wire'
- Rinderkennzeichnungs- und Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz: literally, Cattle marking and beef labeling supervision duties delegation law
Icelandic:
- járnbraut 'railway': járn 'iron', + braut 'path' or 'way'
- farartæki 'vehicle': farar 'journey', + tæki 'apparatus'
- alfræðiorðabók 'encyclopædia': al 'everything', + fræði 'study' or 'knowledge', + orða 'words', + bók 'book'
- símtal 'telephone conversation': sím 'telephone', + tal 'dialogue'
Italian:
- Millepiedi 'millipede': mille 'thousand', + piedi 'feet'
- Ferrovia 'railway': ferro 'iron', + via 'way'
- Tergicristallo 'windscreen wiper': tergere 'to wipe', + cristallo 'crystal, (pane of) glass'
Japanese:
- 目覚まし（時計） mezamashi(dokei) 'alarm clock': 目 me 'eye' + 覚まし samashi (-zamashi) 'awakening (someone)' (+ 時計 tokei (-dokei) clock)
- お好み焼き okonomiyaki: お好み okonomi 'preference' + 焼き yaki 'cooking'
- 日帰り higaeri 'day trip': 日 hi 'day' + 帰り kaeri (-gaeri) 'returning (home)'
- 国会議事堂 kokkaigijidō 'national diet building': 国会 kokkai 'national diet' + 議事 giji 'proceedings' + 堂 dō 'hall'
Korean:
- 안팎 anpak 'inside and outside': 안 an 'inside' + 밖 bak 'outside' (When the two nouns are compounded, the consonant 'b' is strengthened to 'p', giving 안팎 anpak rather than 안밖 anbak.)
Spanish:
- Ciencia-ficción 'science fiction': ciencia, 'science', + ficción, 'fiction' (This word is a calque of the English expression science fiction. In English, the head of a compound word is the last morpheme: science fiction. Conversely, the Spanish head is located at the front, so ciencia ficción sounds like a kind of fictional science rather than scientific fiction.)
- Ciempiés 'centipede': cien 'hundred', + pies 'feet'
- Ferrocarril 'railway': ferro 'iron', + carril 'lane'
- Paraguas 'umbrella': para 'to stop, stops' + aguas '(the) water'
- Cabizbajo 'keeping the head low, in a bad mood'
- Subibaja 'seesaw'
In Germanic languages, compound words are formed by placing a descriptive word before the main word. For example, "starfish" is a specific "fish" with a "star" shape. Likewise, the noun phrase "star shape" means a "star"-like "shape". Whereas "starfish" has an established definition, this is not required: compounds like "star shape" and "starlike" can be composed as needed and understood from their parts. A compound is understood as a word in itself, so it may in turn be used in new compounds, and arbitrarily long words are easy to form. This contrasts with the Romance languages, where prepositions are more often used to specify the relationships between words instead of concatenation.
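The recursive nature of this process can be sketched with nested pairs. In the illustration below, a compound is either a plain string or a (modifier, head) pair whose members may themselves be compounds; the names `spell` and `head` are hypothetical, and the sketch deliberately ignores linking elements and capitalization.

```python
def spell(tree):
    """Write out a (modifier, head) tree as a single concatenated word."""
    if isinstance(tree, str):
        return tree
    modifier, head_part = tree
    return spell(modifier) + spell(head_part)


def head(tree):
    """Return the head of the whole compound: the rightmost simple word."""
    while not isinstance(tree, str):
        tree = tree[1]  # in Germanic compounds the head is the final element
    return tree


starfish = ("star", "fish")
remote = ("funk", ("fern", "bedienung"))  # German Funkfernbedienung, lowercased

print(spell(starfish), "->", head(starfish))
print(spell(remote), "->", head(remote))
```

Because each member of a pair may itself be a pair, the same two functions handle compounds of any depth, which mirrors the observation that a finished compound can serve as a building block for a longer one.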
Among the Germanic languages, English is unusual in that compound words are usually written as separate words. Although English does not form compound nouns to the extent of Dutch or German, noun phrases like "Girl Scout troop", "city council member", and "cellar door" are arguably compound nouns and are used as such in speech. Writing them as separate words is merely an orthographic convention, possibly a result of influence from French.
In Russian, compounding is a common type of word formation, and several types of compounds exist, differing both in the parts of speech compounded and in the way the compound is formed.
Compound nouns may be agglutinative compounds, hyphenated compounds (стол-книга 'folding table' lit. 'table-book', i.e., "book-like table"), or abbreviated compounds (portmanteaux: колхоз 'kolkhoz'). Some compounds look like portmanteaux but are in fact agglutinations of the type stem + word: Академгородок 'Akademgorodok' (from akademichesky gorodok 'academic village'). In agglutinative compound nouns, a linking vowel (interfix) is typically used: пароход 'steamship': пар + о + ход. Compound nouns may be formed as noun + noun, adjective + noun, noun + adjective (rare), or noun + verb (or, rather, noun + verbal noun).
Compound adjectives may be formed either per se, e.g., бело-розовый 'white-pink', or as a result of compounding during the derivation of an adjective from a multiword term: Каменноостровский проспект ([kəmʲɪnnʌʌˈstrovskʲɪj prʌˈspʲɛkt]) 'Stone Island Avenue', a street in St. Petersburg.
Reduplication in Russian is also a source of compounds.
Sanskrit is very rich in compound formation, with seven major compound types and as many as 55 sub-types. Compound formation is an open-ended (productive) process, so it is not possible to list all Sanskrit compounds in a dictionary. Compounds of two or three words are the most frequent, but longer compounds, some running for pages, are not rare in Sanskrit literature. Some examples follow (hyphens mark individual word boundaries for ease of reading but are not required in the original Sanskrit).
- हिमालय (IAST Himālaya, decomposed as hima-ālaya): Name of the Himalaya mountain range. Literally the abode of snow. A compound of two words and four syllables.
- प्रवर-मुकुट-मणि-मरीचि-मञ्जरी-चय-चर्चित-चरण-युगल (IAST pravara-mukuṭa-maṇi-marīci-mañjarī-caya-carcita-caraṇa-yugala): Literally, O the one whose dual feet are covered by the cluster of brilliant rays from the gems of the best crowns, from the Sanskrit work Panchatantra. A compound of nine words and 25 syllables.
- कमला-कुच-कुङ्कुम-पिञ्जरीकृत-वक्षः-स्थल-विराजित-महा-कौस्तुभ-मणि-मरीचि-माला-निराकृत-त्रि-भुवन-तिमिर (IAST kamalā-kuca-kuṅkuma-piñjarīkṛta-vakṣaḥ-sthala-virājita-mahā-kaustubha-maṇi-marīci-mālā-nirākṛta-tri-bhuvana-timira): Literally O the one who dispels the darkness of three worlds by the shine of Kaustubha jewel hanging on the chest which has been made reddish-yellow by the saffron from the bosom of Kamalā (Lakshmi), an adjective of Rama in the Kakabhushundi Rāmāyaṇa. A compound of 16 words and 44 syllables.
- साङ्ख्य-योग-न्याय-वैशेषिक-पूर्व-मीमांसा-वेदान्त-नारद-शाण्डिल्य-भक्ति-सूत्र-गीता-वाल्मीकीय-रामायण-भागवतादि-सिद्धान्त-बोध-पुरः-सर-समधिकृताशेष-तुलसी-दास-साहित्य-सौहित्य-स्वाध्याय-प्रवचन-व्याख्यान-परम-प्रवीणाः (IAST sāṅkhya-yoga-nyāya-vaiśeṣika-pūrva-mīmāṃsā-vedānta-nārada-śāṇḍilya-bhakti-sūtra-gītā-vālmīkīya-rāmāyaṇa-bhāgavatādi-siddhānta-bodha-puraḥ-sara-samadhikṛtāśeṣa-tulasī-dāsa-sāhitya-sauhitya-svādhyāya-pravacana-vyākhyāna-parama-pravīṇāḥ): Literally the acclaimed forerunner in understanding of the canons of Sāṅkhya, Yoga, Nyāya, Vaiśeṣika, Pūrva Mīmāṃsā, Vedānta, Nārada Bhakti Sūtra, Śāṇḍilya Bhakti Sūtra, Bhagavad Gītā, the Ramayana of Vālmīki, Śrīmadbhāgavata; and the most skilled in comprehensive self-study, discoursing and expounding of the complete works of Gosvāmī Tulasīdāsa. An adjective used in a panegyric of Jagadguru Rambhadracharya. The hyphens show only those word boundaries where there is no Sandhi. On including word boundaries with Sandhis (vedānta=veda-anta, rāmāyaṇa=rāma-ayana, bhāgavatādi=bhāgavata-ādi, siddhānta=siddha-anta, samadhikṛtāśeṣa=samadhikṛta-aśeṣa, svādhyāya=sva-adhyāya), this is a compound of 35 words and 86 syllables.
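The word and syllable counts quoted for these examples can be checked mechanically from the IAST transliteration. The sketch below is a rough approximation under stated assumptions: words are taken to be the hyphen-separated pieces, and a syllable is counted for every vowel nucleus, with ai and au treated as single diphthongs. The function names are hypothetical, and the counter does not resolve sandhi, so it should be applied to the form as written.

```python
import re
import unicodedata

# IAST vowel nuclei; "ai" and "au" come first so that each diphthong is
# matched as one nucleus rather than as two separate vowels.
VOWEL = re.compile(unicodedata.normalize("NFC", "ai|au|[aāiīuūṛṝḷeo]"))


def count_words(compound):
    """Count the hyphen-separated members of a transliterated compound."""
    return len([part for part in compound.split("-") if part])


def count_syllables(text):
    """Approximate the syllable count as the number of vowel nuclei."""
    normalized = unicodedata.normalize("NFC", text.lower())
    return len(VOWEL.findall(normalized))


example = "pravara-mukuṭa-maṇi-marīci-mañjarī-caya-carcita-caraṇa-yugala"
print(count_words(example), count_syllables(example))
print(count_syllables("himālaya"))
```

For the nine-member Panchatantra example this reproduces the figures given above, and the joined form himālaya yields four syllables; the unjoined hima-ālaya would yield five, because the sandhi that merges the two adjacent vowels is not modelled.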
Recent trends
Although there is no universally agreed-upon guideline regarding the use of compound words in the English language, in recent decades written English has displayed a noticeable trend towards increased use of compounds. In addition, many recent words have been formed by combining parts of other words, such as pixel (picture element) and bit (binary digit). This is called a syllabic abbreviation.
There is a trend in the Scandinavian languages towards splitting compound words, known in Norwegian as "særskrivingsfeil" (separate writing error). Because Norwegian relies heavily on the distinction between a compound word and the sequence of separate words it consists of, splitting a compound can change the meaning. For example, "røykfritt" (smoke-free, meaning no smoking) has been confused with "røyk fritt" (smoke freely).
The German spelling reform of 1996 introduced the option of hyphenating compound nouns when it enhances comprehensibility and readability. This is done mostly with very long compound words by separating them into two or more smaller compounds, like Eisenbahn-Unterführung (railway underpass) or Kraftfahrzeugs-Betriebsanleitung (car manual). Such practice is also permitted in Norwegian (Bokmål and Nynorsk), and encouraged between parts of the word that have very different pronunciation, such as when one part is a loan word or an acronym.
See also
- Bracketing paradox
- Incorporation (linguistics)
- Multiword expression
- Noun adjunct
- Portmanteau compounds
- Status constructus
- Word formation
- Syllabic abbreviation
Wikimedia Foundation. 2010.