Speech recognition in Linux
There is currently no open-source equivalent of proprietary speech recognition software (e.g. Nuance's Dragon NaturallySpeaking or Windows Speech Recognition) for Linux. However, there are several incomplete open-source projects and solutions that can be used to attain some elements of speech recognition on the free operating system. It is also possible to use Windows speech recognition software under Linux.
Native Linux speech recognition
History
In the late 1990s, a Linux version of ViaVoice (created by IBM) was made available to users at no charge. However, the developer removed the free SDK in 2002.
Current development status
Recently, there has been a push to develop a high-quality native Linux speech recognition engine. As a result, numerous projects dedicated to creating Linux speech recognition solutions (equivalent to current Windows solutions) have been established. One major hurdle is the compilation of a speech corpus to enable production of acoustic models. In response, VoxForge was set up; it aims to collect transcribed speech for use with free and open-source speech recognition engines under the GPL license.
Ubuntu is currently gathering ideas for implementing speech recognition [https://wiki.ubuntu.com/SpeechRecognition SpeechRecognition - Ubuntu Wiki].
Solutions
The following is a list of current projects dedicated to implementing speech recognition in Linux, as well as major (though mostly incomplete) native solutions that are available as of March 2008:
*VoxForge
*Julius
*CMU Sphinx
*HTK (copyrighted by Microsoft, though source code is available for personal use)
* [http://xvoice.sourceforge.net/ Xvoice] (requires ViaVoice to function)
* [http://freespeech.sourceforge.net/ Open Mind Speech]
* [http://live.gnome.org/GnomeVoiceControl GnomeVoiceControl]
* [http://simon-listens.org/ Simon] (this project aims at helping blind people; requires Julius)
It is possible, though complicated, for advanced developers to create Linux speech recognition software by using existing packages derived from open-source projects.
Voice control and keyboard shortcuts
Speech recognition usually refers to software that attempts to distinguish thousands of words in a human language. Voice control may refer to software used for sending operational commands to a computer or appliance. Voice control typically requires a much smaller vocabulary and is thus much easier to implement.
Simple software combined with keyboard shortcuts has the earliest potential for practically accurate voice control in Linux. Keyboard shortcuts can be used to control many Linux programs. GNOME and KDE have extensive and easily reconfigurable keyboard shortcuts for most tasks. Mozilla Firefox has an add-on called [https://addons.mozilla.org/en-US/firefox/addon/879 MouselessBrowsing], which allows links and input boxes to be quickly selected from the keyboard.
Running Windows speech recognition software with Linux
Using a compatibility layer
It is possible to use programs such as Dragon NaturallySpeaking 9 in Linux by using Wine, though some problems will arise [http://appdb.winehq.org/objectManager.php?sClass=version&iId=5402 Dragon NaturallySpeaking 9 - Wine Application Database].
Using virtualized Windows
Using no-cost virtualization software, it is possible to run Windows and NaturallySpeaking under Linux [http://scratchpad.wikia.com/wiki/Speech_recognition_on_free_operating_systems#Running_Windows.2FDNS_in_a_virtual_machine Running Windows/DNS in a virtual machine - Lumeniki]. VMware Server and VirtualBox support copy and paste to and from a virtual machine, making dictated text easily transferable between the virtual machine and Linux. Note that problems (such as sound input errors [http://scratchpad.wikia.com/wiki/Speech_recognition_on_free_operating_systems#Sound_input_problems Sound input problems with DNS in a virtual machine - Lumeniki]) may occur.
WinDictator
[http://foss.eepatents.com/trac/WinDictator/wiki WinDictator] is able to send keystrokes from Windows dictation software (running on a real or virtual Windows machine) to Linux, but installing it may require advanced skills [http://scratchpad.wikia.com/wiki/Speech_recognition_on_free_operating_systems#WinDictator WinDictator - Lumeniki]. This would offer more functionality than plain virtualization (or a compatibility layer) if it allows keyboard shortcuts.
Combining native software with Windows software
If native Linux voice control software could be used simultaneously with Windows speech recognition software (running under Linux), much of the same functionality offered by Windows speech recognition software would be available on Linux.
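One way to picture this combination is a small router that receives recognized text, treats anything close to a short command vocabulary as voice control (mapped to a keyboard shortcut), and passes everything else through as dictation. The following is only a minimal sketch under stated assumptions: the phrases, the shortcuts, the function name route_utterance and the similarity cutoff are all hypothetical, and the recognized text is assumed to arrive as a plain string from whatever engine is in use. It does not reflect how any of the projects listed above work.

```python
import difflib

# Hypothetical command vocabulary: spoken phrase -> keyboard shortcut.
# These phrases and shortcuts are illustrative only.
COMMANDS = {
    "switch window": "Alt+Tab",
    "close window": "Alt+F4",
    "new tab": "Ctrl+T",
    "save file": "Ctrl+S",
}


def route_utterance(text, cutoff=0.8):
    """Classify recognized text as a voice command or as dictation.

    Returns ("command", shortcut) when the text is similar enough to a
    known phrase, otherwise ("dictation", text) with the text unchanged.
    """
    # Lowercase and collapse whitespace so matching ignores such differences.
    normalized = " ".join(text.lower().split())
    # Small vocabularies tolerate fuzzy matching: keep only the best match
    # whose similarity ratio is at least `cutoff`.
    matches = difflib.get_close_matches(
        normalized, list(COMMANDS), n=1, cutoff=cutoff
    )
    if matches:
        return ("command", COMMANDS[matches[0]])
    return ("dictation", text)


if __name__ == "__main__":
    for heard in ["Close window", "please send the report tomorrow"]:
        print(heard, "->", route_utterance(heard))
```

Because the command vocabulary is small, approximate matching is usually enough for the voice control side, while the much harder large-vocabulary dictation is left to the Windows software. Actually sending the shortcut to the desktop is outside the scope of this sketch.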
See also
*Speech recognition
*List of speech recognition software
External links
* [http://linux-sound.org/speech.html Speech Synthesis & Analysis Software]
* [http://raphaelnunes.wordpress.com/2007/06/16/gnome-voice-control-demonstration/ Gnome Voice Control (an incomplete speech recognition solution for GNOME) - Demonstration]
* [http://tldp.org/HOWTO/Speech-Recognition-HOWTO/software.html Speech Recognition Software - list of speech recognition projects and solutions in Linux]
Wikimedia Foundation. 2010.