Pronunciation Lexicon Specification

The Pronunciation Lexicon Specification (PLS) is a W3C specification designed to enable the interoperable specification of pronunciation information for both speech recognition and speech synthesis engines within voice browsing applications. The language is intended to be easy for developers to use while supporting the accurate specification of pronunciation information for international use.

The language allows one or more pronunciations for a word or phrase to be specified using a standard pronunciation alphabet or, if necessary, vendor-specific alphabets. Pronunciations are grouped together into a PLS document, which may be referenced from other markup languages, such as the Speech Recognition Grammar Specification (SRGS) and the Speech Synthesis Markup Language (SSML).

Usage

Here are the entries of an example PLS document, each given as its spellings followed by its IPA pronunciations:

* judgment, judgement: ˈdʒʌdʒ.mənt
* fiancé, fiance: fiˈɒns.eɪ, ˌfiː.ɑːnˈseɪ
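In PLS 1.0 markup, a lexicon holding these two entries might look like the sketch below. The `<lexeme>`, `<grapheme>` and `<phoneme>` elements and the namespace come from the PLS specification; the `xml:lang` value is an assumption, since the language of the example is not stated here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: xml:lang="en-US" is an assumed value -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <!-- Two spellings sharing one pronunciation -->
  <lexeme>
    <grapheme>judgment</grapheme>
    <grapheme>judgement</grapheme>
    <phoneme>ˈdʒʌdʒ.mənt</phoneme>
  </lexeme>
  <!-- Two spellings sharing two pronunciations -->
  <lexeme>
    <grapheme>fiancé</grapheme>
    <grapheme>fiance</grapheme>
    <phoneme>fiˈɒns.eɪ</phoneme>
    <phoneme>ˌfiː.ɑːnˈseɪ</phoneme>
  </lexeme>
</lexicon>
```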

This lexicon could be used to improve text-to-speech (TTS) output for the following SSML 1.0 content:

<p> In the judgement of my fiancé, Las Vegas is the best place for a honeymoon. I replied that I preferred Venice and didn't think the Venetian casino was an acceptable compromise.</p>
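An SSML 1.0 document refers to an external lexicon through a `<lexicon>` element placed before the spoken content. A minimal sketch of the complete document follows; the lexicon URI is a hypothetical placeholder, and `xml:lang` is again an assumed value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<speak version="1.0"
       xmlns="http://www.w3.org/2001/10/synthesis"
       xml:lang="en-US">
  <!-- Hypothetical URI: point this at the PLS document to be used -->
  <lexicon uri="http://www.example.com/lexicon.pls"/>
  <p> In the judgement of my fiancé, Las Vegas is the best place for a honeymoon.
      I replied that I preferred Venice and didn't think the Venetian casino
      was an acceptable compromise.</p>
</speak>
```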

It could also improve automatic speech recognition (ASR) for an SRGS 1.0 grammar that accepts the following phrases:

* Terminator 2: Judgment Day
* My Big Fat Obnoxious Fiance
* Pluto's Judgement Day
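In the XML form of SRGS 1.0, such a grammar might be sketched as below. The `<lexicon>` reference works as in SSML; the rule name `movies` and the lexicon URI are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar version="1.0"
         xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" root="movies" mode="voice">
  <!-- Hypothetical URI: point this at the PLS document to be used -->
  <lexicon uri="http://www.example.com/lexicon.pls"/>
  <!-- "movies" is an illustrative rule name -->
  <rule id="movies" scope="public">
    <one-of>
      <item>Terminator 2: Judgment Day</item>
      <item>My Big Fat Obnoxious Fiance</item>
      <item>Pluto's Judgement Day</item>
    </one-of>
  </rule>
</grammar>
```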

Common Use Cases

Multiple pronunciations for the same orthography

For ASR systems it is common to rely on multiple pronunciations of the same word or phrase in order to cope with variations of pronunciation within a language. In the Pronunciation Lexicon language, multiple pronunciations are represented by more than one <phoneme> (or <alias>) element within the same <lexeme> element.

In the following example the word "Newton" has two possible pronunciations.

* Newton: ˈnjuːtən, ˈnuːtən
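Written as PLS markup, the entry for "Newton" could be sketched as one `<lexeme>` carrying two `<phoneme>` elements. The `xml:lang` value is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: xml:lang="en-US" is an assumed value -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>Newton</grapheme>
    <!-- Both pronunciations are accepted for the same spelling -->
    <phoneme>ˈnjuːtən</phoneme>
    <phoneme>ˈnuːtən</phoneme>
  </lexeme>
</lexicon>
```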

Multiple orthographies

In some situations there are alternative textual representations for the same word or phrase. This can arise for a number of reasons; see Section 4.5 of PLS for details. Because these are representations that have the same meaning (as opposed to homophones), it is recommended that they be represented using a single <lexeme> element that contains multiple <grapheme> elements.

Here are two simple examples of multiple orthographies: alternative spellings of an English word, and a Japanese word written in romaji, kanji and hiragana.

* colour, color: ˈkʌlər
* nihongo, 日本語, にほんご: ɲihoŋo
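The Japanese entry could be sketched in PLS markup as follows, with one `<grapheme>` per writing system inside a single `<lexeme>`. The English "colour"/"color" entry would follow the same pattern in a lexicon of its own, since `xml:lang` is set once on the `<lexicon>` element. The value `ja` is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: xml:lang="ja" is an assumed value -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="ja">
  <lexeme>
    <grapheme>nihongo</grapheme>   <!-- romaji -->
    <grapheme>日本語</grapheme>      <!-- kanji -->
    <grapheme>にほんご</grapheme>    <!-- hiragana -->
    <phoneme>ɲihoŋo</phoneme>
  </lexeme>
</lexicon>
```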

Homophones

Most languages have homophones, words with the same pronunciation but different meanings (and possibly different spellings), for instance "seed" and "cede". It is recommended that these be represented as different lexemes.

* cede: siːd
* seed: siːd
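As a sketch in PLS markup, the two homophones become two separate `<lexeme>` elements that happen to carry the same `<phoneme>` content. The `xml:lang` value is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: xml:lang="en-US" is an assumed value -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <!-- Different words, so different lexemes, despite identical phonemes -->
  <lexeme>
    <grapheme>cede</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
  <lexeme>
    <grapheme>seed</grapheme>
    <phoneme>siːd</phoneme>
  </lexeme>
</lexicon>
```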

Homographs

Most languages have words with different meanings but the same spelling (and sometimes different pronunciations), called homographs. For example, in English the word "bass" (a fish) and the word "bass" (in music) have identical spellings but different meanings and pronunciations. It is recommended that such words be represented using separate <lexeme> elements distinguished by different values of the role attribute (see Section 4.4 of PLS 1.0). However, a pronunciation lexicon author who does not want to distinguish between the two words can simply represent them as alternative pronunciations within the same <lexeme> element. In that case the TTS processor cannot determine when to apply the first or the second transcription.

In this example the pronunciations of the homograph "bass" are shown.

* bass: bæs, beɪs

Note that English contains numerous examples of noun-verb pairs that can be treated either as homographs or as alternative pronunciations, depending on author preference. Two examples are the noun/verb "refuse" and the noun/verb "address".

* refuse (verb): rɪˈfjuːz
* refuse (noun): ˈrefjuːs
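Treated as homographs, the two readings of "refuse" could be sketched as separate `<lexeme>` elements distinguished by the role attribute. Role values are namespace-qualified names; the `mypos` prefix, its namespace URI and the value names `verb` and `noun` are hypothetical and stand in for whatever part-of-speech scheme the author uses.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the mypos namespace and its values are hypothetical -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         xmlns:mypos="http://www.example.com/my_pos_namespace"
         alphabet="ipa" xml:lang="en-US">
  <lexeme role="mypos:verb">
    <grapheme>refuse</grapheme>
    <phoneme>rɪˈfjuːz</phoneme>
  </lexeme>
  <lexeme role="mypos:noun">
    <grapheme>refuse</grapheme>
    <phoneme>ˈrefjuːs</phoneme>
  </lexeme>
</lexicon>
```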

Pronunciation by Orthography (Acronyms, Abbreviations, etc.)

For some words and phrases pronunciation can be expressed quickly and conveniently as a sequence of other orthographies. The developer is not required to have linguistic knowledge, but instead makes use of the pronunciations that are already expected to be available. To express pronunciations using other orthographies the <alias> element may be used.

This feature is particularly useful for acronym expansion. Each example below gives a spelling followed by the orthography used as its alias:

* W3C: World Wide Web Consortium (acronym expansion)
* 101: one hundred and one (number representation)
* Thailand: tie land (crude pronunciation mechanism)
* BBC 1: be be sea one (crude pronunciation mechanism and acronym expansion)
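In PLS markup these entries could be sketched with `<alias>` in place of `<phoneme>`, so that each pronunciation is derived from words the processor is already expected to handle. The `xml:lang` value is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: xml:lang="en-US" is an assumed value -->
<lexicon version="1.0"
         xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <!-- Acronym expansion -->
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
  <!-- Number representation -->
  <lexeme>
    <grapheme>101</grapheme>
    <alias>one hundred and one</alias>
  </lexeme>
  <!-- Crude pronunciation mechanism -->
  <lexeme>
    <grapheme>Thailand</grapheme>
    <alias>tie land</alias>
  </lexeme>
  <!-- Crude pronunciation mechanism and acronym expansion -->
  <lexeme>
    <grapheme>BBC 1</grapheme>
    <alias>be be sea one</alias>
  </lexeme>
</lexicon>
```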

Status and Future

PLS 1.0 reached Candidate Recommendation status on 12 December 2007 and was published as a W3C Recommendation on 14 October 2008.

References

* [http://www.w3.org/TR/pronunciation-lexicon/ PLS Specification (W3C Working Draft)]

See also

* VoiceXML
* SRGS
* SSML
* SISR

External links

* [http://www.w3.org/TR/pronunciation-lexicon/ PLS Specification (W3C Working Draft)]
* [http://www.w3.org/TR/speech-grammar/ SRGS Specification (W3C Recommendation)]
* [http://www.w3.org/TR/speech-synthesis/ SSML Specification (W3C Recommendation)]
* [http://www.voicexml.org VoiceXML Forum]
* [http://www.orange.com/en_EN/innovation/patents_licensing/Software/PLS.html France Telecom Orange Labs implementation of PLS 1.0 under the GNU General Public License version 3]
* [http://sourceforge.net/projects/openpls SourceForge project for a Java-based implementation of PLS 1.0]


Wikimedia Foundation. 2010.
