XHTML+Voice
XHTML+Voice (commonly X+V) is an XML language for describing multimodal user interfaces. The two essential modalities are visual and auditory. Visual interaction is defined, as in most current web pages, via XHTML. Auditory components are defined by a subset of VoiceXML. The voice and visual components of an X+V document are tied together through a combination of ECMAScript, JavaScript, and XML Events.
Voice input
Voice input, or speech recognition, is based on grammars that define the set of possible input text. In contrast to the probabilistic approach employed by popular software packages such as Dragon NaturallySpeaking, the grammar-based approach provides the recognizer with important contextual information that significantly boosts recognition accuracy. Supported grammar formats include JSGF.
Voice output
Voice output, or speech synthesis, can read any string at virtually any time. Pitch, volume, and other characteristics can be customized using CSS and Speech Synthesis Markup Language (SSML); however, the Opera web browser does not currently support all of these features.
MIME types
The previously recommended MIME type for X+V documents is application/xhtml+voice+xml, which is what the Opera browser uses. Opera will also interpret X+V documents served as text/xml. The currently recommended MIME type for X+V documents is application/xv+xml. Since most web servers associate the .xml extension with text/xml, giving static X+V document files an .xml extension is a fairly safe way of making them browsable.
X+V-enabled browsers
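The most commonly used X+V browser is the Opera browser. Opera users can enable X+V support through the steps described at http://www.opera.com/voice/. Voice is not yet supported in Opera Mini or on platforms other than Windows.
Detecting support for X+V is best done on the server by checking the HTTP "Accept" header for the MIME type application/xhtml+voice+xml. A PHP check along the following lines prints "true" if the requesting browser advertises XHTML+Voice support and "false" otherwise:
<?php
// Read the Accept header; treat a missing header as an empty string.
$accept = isset($_SERVER['HTTP_ACCEPT']) ? $_SERVER['HTTP_ACCEPT'] : '';
// strpos() returns false when the MIME type is absent, and 0 when it comes first.
if (strpos($accept, 'application/xhtml+voice+xml') !== false) {
    echo "true";
} else {
    echo "false";
}
?>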
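The Accept-header check described above is not specific to PHP. The following is a minimal Python sketch of the same idea, under a few assumptions that are not taken from the X+V specification: the helper names `accepted_media_types` and `supports_xv` are hypothetical, either of the two MIME types mentioned in this article is treated as a sign of support, wildcard entries such as `*/*` are deliberately not counted, and an entry with `q=0` (or an unparsable `q` value) is treated as not accepted.

```python
# MIME types named in this article. Accepting either one as evidence of
# X+V support is an assumption of this sketch.
XV_MIME_TYPES = ("application/xhtml+voice+xml", "application/xv+xml")


def accepted_media_types(accept_header):
    """Return the media types listed in an Accept header with q > 0."""
    accepted = set()
    for item in accept_header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value.strip())
                except ValueError:
                    quality = 0.0  # assumption: malformed q means "not accepted"
        if quality > 0:
            accepted.add(media_type)
    return accepted


def supports_xv(accept_header):
    """Return True if the header explicitly lists an X+V MIME type."""
    if not accept_header:
        return False
    accepted = accepted_media_types(accept_header)
    return any(mime in accepted for mime in XV_MIME_TYPES)
```

With these definitions, `supports_xv("text/html, application/xhtml+voice+xml;q=0.9")` evaluates to `True`, while `supports_xv("text/html, */*;q=0.8")` evaluates to `False`, because a wildcard alone does not show that the browser can render voice content. A server could use the result to decide whether to send the X+V version of a page or a plain XHTML fallback.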
Related Technology
Speech Application Language Tags (SALT) is a very similar format developed by Microsoft in 2001 to compete with VoiceXML and XHTML+Voice. SALT also provides users with multimodal support, including grammar-based recognition and speech-synthesized output. The main differences lie in who supports each format. Many companies, in particular IBM and Opera Software, support VoiceXML and XHTML+Voice by providing development tools. SALT is supported almost exclusively by Microsoft, through products such as the Microsoft Speech Application SDK and Microsoft Speech Server.
External links
* [http://www.voicexml.org/specs/multimodal/x+v/12/ XHTML+Voice v1.2]
* [http://dev.opera.com/articles/voice/ Voice - Opera Developer Community]
* [ftp://ftp.software.ibm.com/software/pervasive/info/multimodal/XHTML_voice_programmers_guide.pdf XHTML+Voice Programmer's Guide]
* [http://www.opera.com/download/ Download Opera Web Browser]
* [http://davinci.newcs.uwindsor.ca/~speechweb/movie.mov Video demonstration using XHTML+Voice]
* [http://cs.uwindsor.ca/~speechweb/ The SpeechWeb Project]
* [http://www.apps.ietf.org/rfc/rfc4374.txt RFC 4374 on MIME type]
Wikimedia Foundation. 2010.