CELT
CELT
Developed by: Xiph.Org Foundation
Type of format: Audio
Contained by: Ogg
Extended to: Opus
Standard(s): Documentation

libcelt
Developer(s): Xiph.Org Foundation, Jean-Marc Valin
Preview release: 0.11.1 / February 15, 2011
Operating system: Cross-platform
Type: Audio codec, reference implementation
License: 2-clause BSD (free software)
Website: celt-codec.org

Constrained Energy Lapped Transform (CELT) is an open, royalty-free audio compression format and a free software codec with especially low algorithmic delay for use in low-latency audio communication. It is a lossy codec, meaning that some quality is permanently sacrificed to reduce the amount of data. The algorithms are openly documented and may be used free of software patent restrictions. It is being developed by the Xiph.Org Foundation (as part of the Ogg codec family) and the codec working group of the Internet Engineering Task Force (IETF). The reference implementation is a software library named libcelt, which is written in the programming language C and published as free software under a 2-clause BSD license.
CELT is meant to bridge the gap between Vorbis and Speex for applications where both high-quality audio and low delay are desired.[1] It is suitable for carrying both speech and music. It borrows ideas from the CELP algorithm, but avoids some of its limitations by operating exclusively in the frequency domain.[1]
The first development version of CELT was published in December 2007.[2]
Properties
CELT supports sampling rates from 32 kHz to 48 kHz and above, and adaptive bit rates from 32 kbit/s to 128 kbit/s per channel and above. It supports mono and stereo and is applicable to both speech and music. Its algorithmic delay is ultra-low (as low as 2 ms; scalable, typically from 3 to 9 ms). There are no known intellectual property issues, and it is released under the permissive, open-source 2-clause BSD license.[3][1]
The goal is a codec for real-time applications, so the central feature is low algorithmic delay. CELT typically operates with latencies of three to nine milliseconds, but it can be configured to below two milliseconds at the price of a higher bitrate for similar audio quality.[4] This undercuts the latencies that are possible with other commonly used codecs.
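The link between block size and delay can be made concrete with a little arithmetic. The sketch below assumes that the algorithmic delay is one frame plus the look-ahead needed for the window overlap. The function name and the frame and look-ahead sizes are illustrative assumptions, not settings taken from libcelt.

```python
def algorithmic_delay_ms(frame_size, lookahead, sample_rate):
    """Delay in milliseconds: one full frame is buffered, plus look-ahead samples."""
    return 1000.0 * (frame_size + lookahead) / sample_rate


# Illustrative (assumed) configurations at a 48 kHz sampling rate.
for frame_size, lookahead in [(64, 32), (128, 64), (256, 128)]:
    delay = algorithmic_delay_ms(frame_size, lookahead, 48000)
    print(f"frame={frame_size:3d} lookahead={lookahead:3d} -> {delay:.1f} ms")
```

Halving the frame size halves the delay, which is why very short blocks are attractive despite their poorer frequency resolution.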
Like its sister project Vorbis, it is a fullband (entire human hearing range) general-purpose codec, i.e. it is not specialised for particular types of audio signals, which distinguishes it from its other sister project, Speex. It processes audio signals with sampling rates between 32 and 96 kHz and up to two channels (stereophonic sound). The format thus allows for transparent results as well as for bitrates down to 24 kbit/s.[4] Overall, its compression capabilities are said to be significantly superior to those of the MP3 format. As another useful feature for real-time applications like telephony, CELT performs very well at low bitrates. The audio quality there is said to be superior to Vorbis and even on par with HE-AACv1, thanks to band folding.[5][6] In comparative double-blind listening tests it proved to be noticeably superior to HE-AACv1 at ~64 kbit/s.[7]
It has a comparatively low computational complexity, similar to that of the low-delay variant of AAC (AAC-LD) and significantly below that of Vorbis.[8]
It supports both constant and variable bitrate. If the signal disappears into the noise floor, as in speech pauses and similar cases, the transmission can be limited to signalling the decoder to output comfort noise. Most settings of the inherently streaming-capable format can be changed on the fly without interrupting transmission.
The format is robust to transmission errors. The loss of whole packets as well as bit errors can be concealed, with audio quality degrading gradually (packet loss concealment, PLC).
Technology
CELT is a transform codec based on the modified discrete cosine transform (MDCT) and concepts from CELP (with a code book for excitation, but in the frequency domain).
The initial PCM-coded signal is cut into relatively small, overlapping blocks for the MDCT (window function) and transformed to frequency coefficients. Choosing an especially short block size enables low latency, but also leads to poor frequency resolution, which has to be compensated for. To reduce the algorithmic delay further at the expense of a minor sacrifice in audio quality, the inherent 50 % overlap between the blocks is effectively cut in half by silencing the signal during one eighth of the block at each end.[4]
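The windowing described above can be sketched in a few lines. The following is a minimal, unoptimised illustration, assuming a direct O(N²) MDCT and a sine-shaped ramp; it is not libcelt's window or transform. The window is zero over the first and last eighth of each block, so adjacent blocks effectively overlap by only a quarter of the block length, yet overlap-adding the inverse transforms still reconstructs the input. All function names are hypothetical.

```python
import math


def low_overlap_window(n_half):
    """Window of length 2*n_half: zero for the first and last eighth,
    sine ramps of length n_half/2, and flat (1.0) in between."""
    assert n_half % 4 == 0
    zero_len, ramp_len = n_half // 4, n_half // 2
    rising = [math.sin(0.5 * math.pi * (i + 0.5) / ramp_len) for i in range(ramp_len)]
    first_half = [0.0] * zero_len + rising + [1.0] * (n_half - zero_len - ramp_len)
    return first_half + first_half[::-1]


def _basis(n, k, n_half):
    return math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))


def mdct(block, window):
    """2*n_half windowed samples -> n_half frequency coefficients."""
    n_half = len(block) // 2
    return [sum(window[n] * block[n] * _basis(n, k, n_half) for n in range(2 * n_half))
            for k in range(n_half)]


def imdct(coeffs, window):
    """n_half coefficients -> 2*n_half windowed, time-aliased samples."""
    n_half = len(coeffs)
    return [2.0 / n_half * window[n] * sum(coeffs[k] * _basis(n, k, n_half) for k in range(n_half))
            for n in range(2 * n_half)]


def mdct_roundtrip(signal, n_half):
    """Transform overlapping blocks and rejoin them by weighted overlap-add."""
    window = low_overlap_window(n_half)
    tail = n_half + (-len(signal)) % n_half
    padded = [0.0] * n_half + list(signal) + [0.0] * tail
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        rebuilt = imdct(mdct(padded[start:start + 2 * n_half], window), window)
        for i, value in enumerate(rebuilt):
            out[start + i] += value
    return out[n_half:n_half + len(signal)]


signal = [math.sin(0.3 * i) + 0.05 * i for i in range(40)]
error = max(abs(a - b) for a, b in zip(signal, mdct_roundtrip(signal, 8)))
print("max reconstruction error:", error)
```

Without quantisation the round trip is exact up to floating-point error. In a real codec the coefficients are quantised between `mdct` and `imdct`, and the window shape then affects how the quantisation error is spread in time.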
The coefficients are grouped into bands that resemble the critical bands of the human auditory system. The total energy of each band is analysed, and the values are quantised for data reduction and compressed through prediction: only the difference from the predicted values is transmitted (delta encoding).
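As a rough illustration of this step, the sketch below computes one energy value per band, quantises it on a logarithmic scale, and transmits only differences. The band edges, the step size, and the predictor (simply the previous band's value) are assumptions chosen for brevity; the actual codec's prediction is not specified here.

```python
import math


def band_energies(coeffs, band_edges):
    """Square root of the summed squared coefficients in each band."""
    return [math.sqrt(sum(c * c for c in coeffs[lo:hi]))
            for lo, hi in zip(band_edges, band_edges[1:])]


def quantise_log_energy(energies, step=1.0, floor=1e-9):
    """Quantise log2 energies to integer multiples of `step` (assumed resolution)."""
    return [round(math.log2(max(e, floor)) / step) for e in energies]


def delta_encode(values):
    """Predict each value from the previous one and keep only the difference."""
    deltas, previous = [], 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    values, previous = [], 0
    for delta in deltas:
        previous += delta
        values.append(previous)
    return values


coeffs = [4.0, -3.0, 2.0, 2.0, 0.5, -0.5, 0.25, 0.25]
edges = [0, 2, 4, 8]  # hypothetical band layout: three bands
quantised = quantise_log_energy(band_energies(coeffs, edges))
print(quantised, delta_encode(quantised))
```

Because neighbouring bands tend to have similar energy, the differences are usually small numbers that an entropy coder can represent compactly.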
The (unquantised) band energy values are removed from the raw MDCT coefficients (normalisation). The coefficients of the resulting residual signal (the so-called “band shape”) are coded by Pyramid Vector Quantisation (PVQ, a spherical vector quantisation).[9] This encoding leads to code words of fixed (predictable) length, which in turn provides robustness against bit errors and leaves no need for entropy encoding.[6] Finally, all outputs of the encoder are coded into one bitstream by a range encoder.[10] In connection with PVQ, CELT uses a technique known as band folding, which is said to deliver an effect similar to spectral band replication (SBR) by reusing coefficients of lower bands for higher ones, while having much less impact on algorithmic delay and computational complexity than SBR. This works against “birdie” artifacts by preserving more richness in the affected frequency bands.
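The idea behind PVQ can be sketched as follows: a band shape is approximated by an integer vector whose absolute values sum to a fixed number of pulses K, and the number of such vectors is known in advance, which is why the code word length is predictable. The greedy search below is a simplified stand-in chosen for clarity, not libcelt's search, and the function names are hypothetical. The codebook size uses the standard recurrence for counting pyramid code vectors.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def pvq_codebook_size(n, k):
    """Number of integer vectors of dimension n whose absolute values sum to k."""
    if k == 0:
        return 1
    if n == 0:
        return 0
    return (pvq_codebook_size(n - 1, k)
            + pvq_codebook_size(n, k - 1)
            + pvq_codebook_size(n - 1, k - 1))


def pvq_quantise(shape, k):
    """Greedily place k unit pulses to maximise correlation with `shape`."""
    mags = [abs(v) for v in shape]
    pulses = [0] * len(shape)
    corr, energy = 0.0, 0.0
    for _ in range(k):
        best_j, best_score = 0, -1.0
        for j, mag in enumerate(mags):
            new_corr = corr + mag
            new_energy = energy + 2 * pulses[j] + 1  # (p+1)^2 - p^2
            score = new_corr * new_corr / new_energy
            if score > best_score:
                best_j, best_score = j, score
        corr += mags[best_j]
        energy += 2 * pulses[best_j] + 1
        pulses[best_j] += 1
    # Restore the signs of the original coefficients.
    return [p if v >= 0 else -p for p, v in zip(pulses, shape)]


shape = [0.6, -0.8]
print(pvq_quantise(shape, 2), "out of", pvq_codebook_size(2, 2), "code vectors")
```

With dimension 2 and two pulses there are exactly 8 code vectors, so an index into this codebook always fits in 3 bits regardless of the signal.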
The decoder unpacks the individual components from the range-coded bitstream, multiplies the band shape coefficients by the band energy and transforms them back (via the inverse MDCT) to PCM data. The individual blocks are rejoined using weighted overlap-add (WOLA). Many parameters are not explicitly coded, but are instead reconstructed using the same functions as the encoder.
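The denormalisation step on the decoder side can be illustrated with a short sketch. It assumes the band shape arrives as an integer pulse vector and the band energy as a single gain; the function name is hypothetical. The shape is scaled to unit norm and then multiplied by the energy, so the decoded band always carries exactly the transmitted energy, however coarsely the shape was quantised.

```python
import math


def decode_band(pulses, band_energy):
    """Scale a decoded pulse vector to unit norm, then apply the band energy."""
    norm = math.sqrt(sum(p * p for p in pulses))
    if norm == 0.0:
        return [0.0] * len(pulses)
    return [band_energy * p / norm for p in pulses]


band = decode_band([1, -1, 0, 2], band_energy=3.5)
print(band, math.sqrt(sum(c * c for c in band)))
```

This property is what the “constrained energy” in the codec's name refers to.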
For channel coupling, CELT may use M/S stereo or intensity stereo. Blocks can be coded independently of adjacent frames (intra-frame), for example to enable a decoder to jump into a running stream. With transform codecs, so-called pre-echo artifacts can become audible, because the quantisation error of sharp, energy-heavy sounds (transients) can spread over the entire MDCT block, and the transient does not mask it backward in time as well as it does forward. With CELT, each block can be further divided to thwart such artifacts.
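For readers unfamiliar with M/S stereo, the following sketch shows the generic mid/side transform on plain sample lists. It illustrates the principle only; how CELT applies the coupling per band is not shown here, and the function names are hypothetical.

```python
def ms_encode(left, right):
    """Convert left/right channels into mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover left/right channels from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


left = [0.5, 0.25, -0.75]
right = [0.5, 0.0, -0.5]
mid, side = ms_encode(left, right)
print(mid, side)
print(ms_decode(mid, side))
```

When both channels are similar, the side signal is close to zero and cheap to code, which is the motivation for this kind of coupling.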
History
First work on plans and drafts for a Vorbis successor was done in 2005 at Xiph as part of the Ghost project (initially referred to as “Vorbis II”). Besides the codec plans of Vorbis creator Christopher Montgomery, which are on hold in favour of Theora development, this also led to Jean-Marc Valin's concept of a particularly low-latency codec. Valin has been working on CELT since 2007, and on 29 November of that year he committed the first code to the project's repository.[6] In December 2007 the first developer version, 0.0.1, was published, initially under the name “Code-Excited Lapped Transform”.[11] CELT has been a proposal at the IETF for a free codec standard for telecommunication over the Internet since July 2009,[12][13][3][14] which has also brought the codec working group of the IETF into the development. In May 2009, a draft of an RTP payload format for the CELT codec was published.[15]
As of version 0.9, the frequency-domain pitch prediction used until then was replaced by a less complex solution with a pre- and postfilter pair in the time domain,[16] which was contributed by Raymond Chen of Broadcom.[6]
With CELT 0.11, released on February 4, 2011, the format was tentatively frozen (“soft freeze”), reserving the possibility of last changes should they unexpectedly prove necessary.
Although the format has not been finally frozen, it has been used in the VoIP applications Ekiga and FreeSWITCH since January 2009, and by now also in Mumble, TeamSpeak and other[17] software.
Shortly after the advent of the hybrid codec Opus (formerly known as “Harmony”), the development of CELT as a separate project was halted. It lives on as the basis of Opus and is now being developed as part of this successor project. Opus is a superset of CELT and the speech codec SILK:[18] the CELT algorithms are either not used, used on their own, or used in a hybrid mode in which the SILK algorithms handle the lower part of the spectrum and the CELT algorithms handle the upper part. The corresponding draft has been with the IETF since September 2010.
In April 2011, support for CELT was included in FFmpeg.[19][20]
Software
In January 2009 support for CELT was added to the Ekiga[21] and FreeSWITCH[22] VoIP programs.
CELT is also supported or used by:[23]
- Gablarski[24]
- GStreamer
- jack-audio-connection-kit (netjack)
- liboggz
- Mumble (starting with version 1.2)
- NexGenVoIP
- Radio CHNC
- RoarAudio
- SFLphone
- Soundjack
- TeamSpeak 3
- SPICE
References
1. Xiph.Org: The CELT ultra low-delay audio codec (home page). Retrieved 2009-09-01.
2. Xiph.Org (2007-12-08): CELT releases, celt-0.0.1.tar.gz. Retrieved 2009-09-01.
3. CELT IETF draft.
4. Presentation of the codec by Timothy B. Terriberry (65 minutes of video in ~100 MiB Ogg Theora+Vorbis; see also presentation slides in PDF, ~2.3 MiB).
5. Jason Garrett-Glaser: Important: upcoming CELT bitstream freeze!. In: ffmpeg-devel.mplayerhq.hu, FFmpeg development discussions and patches mailing list. mplayerhq.hu, 2010-11-18. Retrieved 2011-01-25. (in English)
6. Christopher Montgomery: next generation audio: CELT update 20101223. In: Monty's demo pages. Xiph.Org, 2010-12-23. Retrieved 2011-01-26. (in English)
7. Dirk Bösel: CELT beeindruckt beim 64 kb/s Multiformat Hörtest (2011). In: MPeX.net. MPeX.net GmbH, 2011-04-18. Retrieved 2011-04-25. (in German)
8. Jean-Marc Valin, Timothy B. Terriberry, Christopher Montgomery, Gregory Maxwell: A High-Quality Speech and Audio Codec With Less Than 10 ms Delay. In: IEEE Signal Processing Society (ed.): IEEE Transactions on Audio, Speech and Language Processing. 18, No. 1 (http://people.xiph.org/~jm/papers/celt_tasl.pdf ; as at: 2011-02-16).
9. Thomas R. Fischer: A pyramid vector quantizer. In: IEEE (ed.): IEEE Transactions on Information Theory. 32, No. 4.
10. Second version of the draft of the specification.
11. Jean-Marc Valin: Experimental release of Ghost/CELT 0.0.1. In: Hydrogenaudio Forums. 2007-12-09. Retrieved 2011-01-26. (in English)
12. Monika Ermert: IETF kümmert sich um lizenzfreien Audiocodec. In: heise online. 2009-11-13. Retrieved 2011-02-12. (in German)
13. First draft of the specification submitted to the IETF.
14. IETF AVT Working Group (2009-07-04): Constrained-Energy Lapped Transform (CELT) Codec. Retrieved 2009-09-01.
15. IETF AVT Working Group (2009-05-08): RTP Payload Format for the CELT Codec. Retrieved 2009-09-01.
16. Jean-Marc Valin: CELT decoder complexity. In: CELT-dev mailing list. Xiph.Org, 2011-02-15. Retrieved 2011-02-16. (in English)
17. Software that uses or supports CELT. In: CELT website. Xiph.Org. Retrieved 2011-01-25. (in English)
18. Jean-Marc Valin, Koen Vos: Definition of the Opus Audio Codec. In: IETF Internet-Drafts. IETF Network Working Group, 2010-10. Retrieved 2011-01-25. (in English)
19. http://ffmpeg.org/pipermail/ffmpeg-devel/2011-April/110850.html
20. http://git.videolan.org/?p=ffmpeg.git;h=89451dd6e4da40ed73b8bbee2d48d8d8be1d5b0c
21. Ekiga 3.1.0 available
22. FreeSWITCH: New Release For The New Year
23. Software that uses or supports CELT
24. www.gablarski.org
Wikimedia Foundation. 2010.