Speex

Speex
Filename extension	`.spx`
Internet media type	`audio/x-speex, audio/speex, audio/ogg`
Developed by	Xiph.Org Foundation, Jean-Marc Valin
Type of format	Audio
Contained by	Ogg
Standard(s)	RFC 5574
Website	www.speex.org

libspeex
Developer(s)	Xiph.Org Foundation, Jean-Marc Valin^[1]
Initial release	1.0 / March 2003
Stable release	1.1.12^[2] / February 19, 2006
Preview release	1.2rc1 / July 23, 2008
Operating system	Cross-platform
Type	Audio codec, reference implementation
License	BSD-style license^[3]^[4]
Website	Xiph.org downloads

Speex is a patent-free audio compression format designed for speech, and also a free software speech codec that may be used in VoIP applications and podcasts.^[5] It is based on the CELP speech coding algorithm.^[6] Speex claims to be free of any patent restrictions and is licensed under the revised (3-clause) BSD license. It may be used with the Ogg container format or transmitted directly over UDP/RTP.

The Speex designers see their project as complementary to the Vorbis general-purpose audio compression project.

Speex is a lossy format, meaning quality is permanently degraded to reduce file size.

The Speex project was created on February 13, 2002.^[7] The first development versions of Speex were released under the LGPL, but as of version 1.0 beta 1, Speex is released under Xiph's version of the (revised) BSD license.^[8] Speex 1.0 was announced on March 24, 2003, after a year of development.^[9] The last stable version of the Speex encoder and decoder is 1.1.12.^[2]

1 Description
- 1.1 Features
2 Applications
3 See also
4 References
5 External links

Description

Unlike many other speech codecs, Speex is not targeted at cellular telephony but rather at Voice over IP (VoIP) and file-based compression. The design goal was a codec optimized for high-quality speech at low bit rates. To achieve this, the codec supports multiple bit rates and three bandwidths: ultra-wideband (32 kHz sampling rate), wideband (16 kHz sampling rate) and narrowband (telephone quality, 8 kHz sampling rate). Because Speex was designed for VoIP rather than cell phone use, the codec must be robust to lost packets, but not to corrupted ones. These requirements led to the choice of Code Excited Linear Prediction (CELP) as the encoding technique for Speex.^[6] One of the main reasons is that CELP has long been shown to do the job and to scale well to both low bit rates (as evidenced by DoD CELP at 4.8 kbit/s) and high bit rates (as with G.728 at 16 kbit/s). The main characteristics can be summarized as follows:

Free software/open-source, patent and royalty-free.
Integration of narrowband and wideband in the same bit-stream.
Wide range of bit rates available (from 2 kbit/s to 44 kbit/s).
Dynamic bit rate switching and variable bit-rate (VBR).
Voice Activity Detection (VAD, integrated with VBR) (not working since version 1.2).
Variable complexity.
Ultra-wideband mode at 32 kHz (up to 48 kHz).
Intensity stereo encoding option.
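
The three bandwidths named above can be illustrated with a small lookup. This is only a sketch: the 20 ms frame duration is an assumption that the article does not state, and the names `SPEEX_BANDS`, `samples_per_frame` and `pick_band` are hypothetical helpers, not part of libspeex.

```python
# Band modes named in the article, with their sampling rates in Hz.
SPEEX_BANDS = {
    "narrowband": 8_000,
    "wideband": 16_000,
    "ultra-wideband": 32_000,
}

# Assumption (not stated in the article): one frame covers 20 ms of audio.
FRAME_DURATION_MS = 20


def samples_per_frame(band: str) -> int:
    """Return the number of PCM samples in one frame for the given band."""
    rate_hz = SPEEX_BANDS[band]
    return rate_hz * FRAME_DURATION_MS // 1000


def pick_band(sample_rate_hz: int) -> str:
    """Return the band whose sampling rate matches the input exactly."""
    for band, rate_hz in SPEEX_BANDS.items():
        if rate_hz == sample_rate_hz:
            return band
    raise ValueError(f"no Speex band for {sample_rate_hz} Hz")


if __name__ == "__main__":
    for name in SPEEX_BANDS:
        print(name, SPEEX_BANDS[name], "Hz,", samples_per_frame(name), "samples/frame")
```

Under the 20 ms assumption, doubling the sampling rate doubles the number of samples an encoder would consume per frame, which is why the band has to be chosen before any audio is fed in.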

Features

Sampling rate: Speex is mainly designed for three different sampling rates: 8 kHz (the same sampling rate used to transmit telephone calls), 16 kHz, and 32 kHz. These are respectively referred to as narrowband, wideband and ultra-wideband.
Quality: Speex encoding is controlled most of the time by a quality parameter that ranges from 0 to 10. In constant bit-rate (CBR) operation, the quality parameter is an integer, while for variable bit-rate (VBR), the parameter is a real (floating point) number.
Complexity (variable): With Speex, it is possible to vary the complexity allowed for the encoder. This is done by controlling how the search is performed with an integer ranging from 1 to 10, in a way similar to the -1 to -9 options of the gzip compression utility. For normal use, the noise level at complexity 1 is between 1 and 2 dB higher than at complexity 10, but the CPU requirements for complexity 10 are about five times higher than for complexity 1. In practice, the best trade-off is between complexity 2 and 4,^[10] though higher settings are often useful when encoding non-speech sounds like DTMF tones, or if encoding is not in real-time.
Variable Bit-Rate (VBR): Variable bit-rate (VBR) allows a codec to change its bit rate dynamically to adapt to the "difficulty" of the audio being encoded. In the case of Speex, sounds like vowels and high-energy transients require a higher bit rate to achieve good quality, while fricatives (e.g. s and f sounds) can be coded adequately with fewer bits. For this reason, VBR can achieve a lower bit rate for the same quality, or a better quality for a given bit rate. Despite its advantages, VBR has three main drawbacks. First, by only specifying quality, there is no guarantee about the final average bit-rate. Second, for some real-time applications like voice over IP (VoIP), what counts is the maximum bit-rate, which must be low enough for the communication channel. Third, encryption of VBR-encoded speech may not ensure complete privacy, as phrases can still be identified, at least in a controlled setting with a small dictionary of phrases,^[11] by analysing the pattern of variation of the bit rate.
Average Bit-Rate (ABR): Average bit-rate solves one of the problems of VBR, as it dynamically adjusts VBR quality in order to meet a specific target bit-rate. Because the quality/bit-rate is adjusted in real-time (open-loop), the global quality will be slightly lower than that obtained by encoding in VBR with exactly the right quality setting to meet the target average bitrate.
Voice Activity Detection (VAD): When enabled, voice activity detection detects whether the audio being encoded is speech or silence/background noise. VAD is always implicitly activated when encoding in VBR, so the option is only useful in non-VBR operation. In this case, Speex detects non-speech periods and encodes them with just enough bits to reproduce the background noise. This is called "comfort noise generation" (CNG). The last version in which VAD worked properly is 1.1.12; since version 1.2 it has been replaced with a simple Any Activity Detection.
Discontinuous Transmission (DTX): Discontinuous transmission is an addition to VAD/VBR operation that allows the encoder to stop transmitting completely when the background noise is stationary. In a file, 5 bits are used for each missing frame (corresponding to 250 bit/s).
Perceptual enhancement: Perceptual enhancement is a part of the decoder which, when turned on, tries to reduce (the perception of) the noise produced by the coding/decoding process. In most cases, perceptual enhancement makes the sound further from the original objectively (signal-to-noise ratio), but in the end it still sounds better (subjective improvement).
Algorithmic delay: Every codec introduces a delay in the transmission. For Speex, this delay is equal to the frame size, plus some amount of "look-ahead" required to process each frame. In narrowband operation (8 kHz), the delay is 30 ms, while for wideband (16 kHz), the delay is 34 ms. These values don't account for the CPU time it takes to encode or decode the frames.
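
The DTX and delay figures in the list above can be cross-checked with a little arithmetic. The sketch below assumes a 20 ms frame, which the article does not state; under that assumption the 250 bit/s figure follows directly, and the look-ahead values it prints are derived from the quoted delays rather than taken from the source. The function names are illustrative only.

```python
# Assumption (not stated in the article): each frame covers 20 ms of audio.
FRAME_DURATION_MS = 20


def dtx_bit_rate(bits_per_missing_frame: int = 5,
                 frame_ms: int = FRAME_DURATION_MS) -> float:
    """Bit rate in bit/s spent on frames that DTX does not transmit."""
    frames_per_second = 1000 / frame_ms
    return bits_per_missing_frame * frames_per_second


def look_ahead_ms(algorithmic_delay_ms: int,
                  frame_ms: int = FRAME_DURATION_MS) -> int:
    """Look-ahead implied by delay = frame size + look-ahead."""
    return algorithmic_delay_ms - frame_ms


if __name__ == "__main__":
    print("DTX rate:", dtx_bit_rate(), "bit/s")
    print("narrowband look-ahead:", look_ahead_ms(30), "ms")
    print("wideband look-ahead:", look_ahead_ms(34), "ms")
```

With 50 frames per second, 5 bits per missing frame gives the 250 bit/s quoted for DTX, which supports the 20 ms assumption.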

Applications

A large base of applications already supports the Speex codec, from streaming applications such as teleconferencing (e.g. TeamSpeak; many servers prefer Speex due to its good quality), to VoIP systems (e.g. Asterisk), to video games (e.g. Xbox Live,^[12] Civilization 4) and audio processing applications. Most of these are based on the DirectShow filter or OpenACM codec (e.g. Microsoft NetMeeting) on Microsoft Windows, or on Xiph.org's reference implementation, libspeex, on Linux (e.g. Ekiga). There are also plugins for many audio players. See the plugin and software page on the speex.org site for more details.

The media type for Speex is audio/ogg when contained in Ogg, and audio/speex (previously audio/x-speex) when transported through RTP or without a container.
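
The media type rule above can be written as a small helper. This is a minimal sketch: `speex_media_type` is a hypothetical function name, and treating the container as an optional string is an assumption made for illustration.

```python
from typing import Optional


def speex_media_type(container: Optional[str]) -> str:
    """Return the media type for a Speex stream.

    `container` is "ogg" for Ogg-encapsulated Speex, or None when the
    stream is sent over RTP or stored without a container.
    """
    if container is not None and container.lower() == "ogg":
        return "audio/ogg"
    # Current type for RTP or container-less Speex; audio/x-speex is the older name.
    return "audio/speex"


if __name__ == "__main__":
    print(speex_media_type("ogg"))
    print(speex_media_type(None))
```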

The United States Army's Land Warrior system, designed by General Dynamics, also uses Speex for VoIP on an EPLRS radio designed by Raytheon.

The Ear Bible is a single-ear headphone with a built-in Speex player with 1 GB of flash memory^[13], preloaded with a recording of the New American Standard Bible.

ASL Safety & Security's Linux-based VIPA OS software,^[14] which is used in long line public address systems and voice alarm systems at major international air transport hubs and rail networks, also uses Speex.

The Rockbox project uses Speex for its voice interface. It can also play Speex files on supported players, such as the Apple iPod or the iRiver H10.

The Vernier LabQuest handheld data acquisition device for science education uses Speex for voice annotations created by students and teachers using either the built-in or an external microphone.

The Google Mobile App for iPhone currently incorporates Speex.^[15] It has also been suggested that the new Google voice search iPhone app is using Speex to transmit voice to Google servers for interpretation.^[16]

Adobe Flash Player supports Speex starting with Flash Player 10.0.12.36, released in October 2008.^[17] Because of some bugs in Flash Player, the first recommended version for Speex support is 10.0.22.87. Speex in Flash Player can be used for both kinds of communication, through Flash Media Server or P2P. Speex can be decoded or converted to any format, unlike Nellymoser audio, which was the only speech format in previous versions of Flash Player.^[18]^[19] Speex can also be used in the Flash Video container format (.flv), starting with version 10 of the Video File Format Specification (published in November 2008).^[20]

The JavaSonics ListenUp voice recorder uses Speex to compress voice messages that are recorded in a browser and then uploaded to a web server. Primary applications are language training, transcription and social networking.

Speex is used as the voice compression algorithm in the Siri voice assistant on the iPhone 4S.^[21] Since the speech-to-text conversion occurs on Apple's servers, the Speex codec is used to minimize network bandwidth.

References

^ "people.xiph.org - personal webspace of the xiphs - Jean-Marc Valin". Xiph.Org. 2009. http://people.xiph.org/~jm/. Retrieved 2009-09-11.
^ ^a ^b "Speex News". Xiph.Org Foundation. http://www.speex.org/news/. Retrieved 2009-09-01.
^ "The Speex Codec Manual - Speex License". Xiph.Org Foundation. http://www.speex.org/docs/manual/speex-manual/node15.html. Retrieved 2009-09-01.
^ "Sample Xiph.Org Variant of the BSD License". Xiph.Org Foundation. http://www.xiph.org/licenses/bsd/. Retrieved 2009-08-29.
^ Xiph.Org Speex: A Free Codec For Free Speech, Retrieved 2009-09-01
^ ^a ^b Xiph.Org Introduction to CELP Coding, Retrieved 2009-09-01
^ Xiph.org Speex releases - pre-1.0 - NEWS and ChangeLog in speex-0.0.1.tar.gz, Retrieved 2009-09-01
^ Xiph.Org Speex FAQ - Under what license is Speex released?, Retrieved 2009-09-01
^ Xiph.Org (2003-03-24) Speex reaches 1.0; Xiph.Org now a 501(c)(3) Non-Profit Organization, Retrieved 2009-09-01
^ Codec Description
^ Charles V. Wright, Lucas Ballard, Scott E. Coull, Fabian Monrose, Gerald M. Masson. "Spot me if you can: Uncovering Spoken Phrases in Encrypted VoIP Conversations". http://www.cs.jhu.edu/~fabian/papers/oakland08.pdf.
^ As announced by Ralph Giles, the Theora codec maintainer, on LugRadio episode 29
^ [1]
^ IPAM 400: IP Based Intelligent Public Address Amplifier - User Manual
^ http://m.google.com/static/legalnotices.html
^ Deconstructing Google Mobile's Voice Search on the iPhone
^ Adobe (2008) Flash Player 10 Datasheet, Retrieved 2009-09-01
^ AskMeFlash.com (2009-05-10) Speex for Flash, Retrieved on 2009-08-12
^ AskMeFlash.com (2009-05-10) Speex vs Nellymoser, Retrieved on 2009-08-12
^ Adobe Systems Incorporated (November 2008) (PDF). Video File Format Specification, Version 10. Adobe Systems Incorporated. http://www.adobe.com/devnet/flv/pdf/video_file_format_spec_v10.pdf. Retrieved 2009-09-01.
^ [2]

External links

RFC 5574 - RTP Payload Format for the Speex Codec
Official Speex homepage
Plugin & software page
JSpeex is a port of Speex to the Java platform
CSpeex is a port of Speex to the .NET platform based on JSpeex
Speex for flash
RFC 5334 - Ogg Media Types

Categories:

Xiph.Org projects
Audio codecs
Speech codecs
Free multimedia codecs, containers, and splitters

Wikimedia Foundation. 2010.
