Acoustic landmarks and distinctive features

Kenneth N. Stevens and his colleagues at MIT proposed a model of speech perception that is called "acoustic landmarks and distinctive features".

In this model, the incoming acoustic signal is first processed to locate the so-called landmarks, which are distinctive spectral events in the signal. For example, vowels are typically marked by a higher frequency of the first formant, whereas consonants can be specified as discontinuities in the signal and have lower amplitudes in the low and middle regions of the spectrum. These acoustic properties result from articulation. Indeed, secondary articulatory movements may be used to enhance the landmarks when external conditions such as noise require it. Stevens claims that coarticulation causes only limited, systematic, and therefore predictable variation in the signal, which the listener is able to deal with. Within this model, therefore, the so-called lack of invariance is claimed not to exist.
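The idea of a consonantal landmark as a discontinuity in the signal can be illustrated with a minimal sketch. It assumes the signal has already been reduced to one low-band energy value (in dB) per analysis frame; the function name `find_landmarks`, the 9 dB threshold, and the sample contour are illustrative assumptions, not part of Stevens's description.

```python
from typing import List, Tuple


def find_landmarks(band_energy_db: List[float],
                   threshold_db: float = 9.0) -> List[Tuple[int, str]]:
    """Return (frame_index, direction) for each abrupt energy change.

    A frame is flagged when the energy differs from the previous frame
    by at least threshold_db (an assumed value chosen for illustration).
    """
    landmarks = []
    for i in range(1, len(band_energy_db)):
        delta = band_energy_db[i] - band_energy_db[i - 1]
        if delta >= threshold_db:
            landmarks.append((i, "rise"))   # e.g. a consonant release
        elif delta <= -threshold_db:
            landmarks.append((i, "fall"))   # e.g. a consonant closure
    return landmarks


# Hypothetical contour: vowel, consonantal closure, vowel.
energies = [60.0, 61.0, 60.0, 45.0, 44.0, 45.0, 59.0, 60.0]
print(find_landmarks(energies))  # [(3, 'fall'), (6, 'rise')]
```

The sketch only looks at one band and adjacent frames; a realistic detector would presumably combine several frequency bands and smooth the contour first.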

Landmarks are analyzed to determine the articulatory events (gestures) connected with them. In the next stage, acoustic cues are extracted from the signal in the vicinity of the landmarks by mentally measuring certain parameters, such as the frequencies of spectral peaks, amplitudes in the low-frequency region, or timing.
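One of the cues mentioned above, the frequencies of spectral peaks, could be measured roughly as follows. This is a sketch under stated assumptions: the spectrum near a landmark is given as parallel lists of frequencies and amplitudes, and `spectral_peaks` and the sample values are hypothetical.

```python
from typing import List


def spectral_peaks(freqs_hz: List[float], amps_db: List[float],
                   max_peaks: int = 3) -> List[float]:
    """Return the frequencies of the strongest local maxima, low to high."""
    peaks = [
        (freqs_hz[i], amps_db[i])
        for i in range(1, len(amps_db) - 1)
        if amps_db[i] > amps_db[i - 1] and amps_db[i] >= amps_db[i + 1]
    ]
    # Keep the max_peaks strongest maxima, then report them by frequency.
    peaks.sort(key=lambda p: p[1], reverse=True)
    return sorted(freq for freq, _ in peaks[:max_peaks])


# Hypothetical spectrum slice taken near a landmark.
freqs = [250, 500, 750, 1000, 1250, 1500, 1750, 2000]
amps = [40, 55, 42, 38, 36, 50, 41, 39]
print(spectral_peaks(freqs, amps))  # [500, 1500]
```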

The next processing stage comprises the consolidation of acoustic cues and the derivation of distinctive features. These are binary categories related to articulation, for example [+/- high], [+/- back], and [+/- round lips] for vowels, and [+/- sonorant], [+/- lateral], or [+/- nasal] for consonants.
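Because the features are binary, a bundle can be represented as a mapping from feature name to a true or false value. The sketch below uses a toy four-vowel inventory with textbook-style feature values; the inventory, the names `VOWELS`, `identify`, and `format_bundle`, and the shortened feature label `round` are assumptions made for illustration.

```python
from typing import Dict, Optional

# Toy inventory (assumed): each vowel is a bundle of binary features.
VOWELS: Dict[str, Dict[str, bool]] = {
    "i": {"high": True, "back": False, "round": False},
    "u": {"high": True, "back": True, "round": True},
    "e": {"high": False, "back": False, "round": False},
    "o": {"high": False, "back": True, "round": True},
}


def identify(bundle: Dict[str, bool]) -> Optional[str]:
    """Return the segment whose feature bundle equals the given one."""
    for segment, features in VOWELS.items():
        if features == bundle:
            return segment
    return None


def format_bundle(bundle: Dict[str, bool]) -> str:
    """Render a bundle in the usual [+feature] / [-feature] notation."""
    return " ".join(
        f"[{'+' if value else '-'}{name}]" for name, value in bundle.items()
    )


derived = {"high": True, "back": False, "round": False}
print(format_bundle(derived), "->", identify(derived))
# [+high] [-back] [-round] -> i
```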

Bundles of these features uniquely identify speech segments (phonemes, syllables, words). These segments are part of the lexicon stored in the listener's memory. Its units are activated in the process of lexical access and mapped onto the original signal to check whether they match. If they do not, another attempt is made with a different candidate pattern. In this iterative fashion, listeners reconstruct the articulatory events that were necessary to produce the perceived speech signal. The process can therefore be described as analysis-by-synthesis.
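The iterative matching described above can be sketched as a simple loop: take a candidate from the lexicon, generate the pattern it predicts, compare that with what was observed, and move on to the next candidate on a mismatch. The toy lexicon, the string encoding of feature bundles, and the exact-equality comparison are all simplifying assumptions; the model itself does not specify them.

```python
from typing import Callable, Iterable, List, Optional


def analysis_by_synthesis(
    observed: List[str],
    candidates: Iterable[str],
    synthesize: Callable[[str], List[str]],
) -> Optional[str]:
    """Return the first candidate whose synthesized pattern matches."""
    for candidate in candidates:
        predicted = synthesize(candidate)   # internal "synthesis" step
        if predicted == observed:           # comparison with the input
            return candidate
    return None                             # no candidate matched


# Hypothetical lexicon: word -> sequence of feature bundles (as strings).
TOY_LEXICON = {
    "bee": ["+consonantal -nasal", "+high -back"],
    "me": ["+consonantal +nasal", "+high -back"],
    "moo": ["+consonantal +nasal", "+high +back"],
}

observed = ["+consonantal +nasal", "+high -back"]
result = analysis_by_synthesis(observed, TOY_LEXICON, TOY_LEXICON.__getitem__)
print(result)  # me
```

A graded similarity measure would be a more plausible comparison than strict equality, but equality keeps the loop structure easy to see.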

This theory thus posits that the distal objects of speech perception are the articulatory gestures underlying speech. Listeners make sense of the speech signal by referring to them. The model belongs to the family of models referred to as analysis-by-synthesis.

Bibliography

Stevens, K. N. (2002). Toward a model of lexical access based on acoustic landmarks and distinctive features. Journal of the Acoustical Society of America, 111(4), 1872-1891.


Wikimedia Foundation. 2010.
