Shannon's source coding theorem
In information theory, Shannon's source coding theorem (or noiseless coding theorem) establishes the limits to possible data compression, and the operational meaning of the Shannon entropy.

The source coding theorem shows that (in the limit, as the length of a stream of i.i.d. data tends to infinity) it is impossible to compress the data such that the code rate (average number of bits per symbol) is less than the Shannon entropy of the source, without it being virtually certain that information will be lost. However, it is possible to get the code rate arbitrarily close to the Shannon entropy, with negligible probability of loss.
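As a small illustration of this limit (an added sketch, not part of the original article), the following Python snippet computes the Shannon entropy of an assumed Bernoulli(0.1) source; by the theorem, roughly N·H(X) bits per N source symbols are achievable in the limit, and no lossless scheme can reliably do better. The distribution and the count of 1000 symbols are arbitrary choices for the example.

from math import log2

def entropy(probs):
    # Shannon entropy, in bits, of a discrete distribution given as probabilities.
    return -sum(p * log2(p) for p in probs if p > 0)

# Hypothetical binary source: emits 1 with probability 0.1 and 0 with probability 0.9.
H = entropy([0.1, 0.9])
print(f"H(X) = {H:.3f} bits/symbol")                  # about 0.469
print(f"about {1000 * H:.0f} bits for 1000 symbols")  # the asymptotic limit, not an exact count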
The source coding theorem for symbol codes places an upper and a lower bound on the minimal possible expected length of codewords as a function of the entropy of the input word (which is viewed as a random variable) and of the size of the target alphabet.

Statements
"Source coding" is a mapping from (a sequence of) symbols from an information source to a sequence of alphabet symbols (usually bits) such that the source symbols can be exactly recovered from the binary bits (lossless source coding) or recovered within some distortion (lossy source coding). This is the concept behind
data compression .Source coding theorem
In information theory, the source coding theorem (Shannon 1948) informally states that:
:"N" i.i.d. random variables each with entropy "H(X)" can be compressed into more than "NH(X)"
bits with negligible risk of information loss, as "N" tends to infinity; but conversely, if they are compressed into fewer than "NH(X)" bits it is virtually certain that information will be lost." (MacKay 2003).Source coding theorem for symbol codes
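A quick empirical check of this statement (an added sketch; the four-symbol distribution, the value of N, and the use of zlib are all arbitrary illustrative choices): draw N i.i.d. symbols, compress them with a general-purpose compressor, and compare the achieved rate with the entropy H(X). The measured rate should come out at or above the entropy bound; how close it gets depends on the compressor and on N.

import random, zlib
from math import log2

random.seed(0)
probs = {"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10}  # assumed source distribution
H = -sum(p * log2(p) for p in probs.values())        # entropy in bits per symbol

N = 100_000
symbols = random.choices(list(probs), weights=list(probs.values()), k=N)
data = "".join(symbols).encode("ascii")

compressed_bits = 8 * len(zlib.compress(data, 9))
print(f"entropy bound : {H:.3f} bits/symbol ({N * H:.0f} bits total)")
print(f"zlib achieved : {compressed_bits / N:.3f} bits/symbol")

With the distribution above, H(X) is about 1.74 bits per symbol, so the theorem says that roughly 174,000 bits are needed to describe the 100,000 symbols reliably, and no lossless compressor can use substantially fewer.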
Source coding theorem for symbol codes

Let $\Sigma_1, \Sigma_2$ denote two finite alphabets and let $\Sigma_1^*$ and $\Sigma_2^*$ denote the set of all finite words from those alphabets (respectively).

Suppose that "X" is a random variable taking values in $\Sigma_1$ and let "f" be a decipherable code from $\Sigma_1^*$ to $\Sigma_2^*$ where $|\Sigma_2| = a$. Let "S" denote the random variable given by the length of the codeword "f(X)".
If "f" is optimal in the sense that it has the minimal expected wordlength for "X", then
:(Shannon 1948)
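To make the bound concrete, the sketch below (an added illustration with an arbitrary example distribution) builds an optimal binary symbol code with Huffman's algorithm and checks that its expected codeword length E[S] satisfies H(X)/log2 a ≤ E[S] < H(X)/log2 a + 1 with a = 2, so log2 a = 1.

import heapq
from math import log2

def huffman_lengths(probs):
    # Codeword lengths of an optimal binary (prefix-free) Huffman code.
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}  # every leaf gains one bit
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"a": 0.4, "b": 0.3, "c": 0.2, "d": 0.1}  # assumed input distribution
H = -sum(p * log2(p) for p in probs.values())
lengths = huffman_lengths(probs)
expected_len = sum(probs[s] * lengths[s] for s in probs)
print(f"H(X) = {H:.3f} bits, E[S] = {expected_len:.3f} bits")
print(H <= expected_len < H + 1)  # the symbol-code bound with a = 2

For this distribution the code lengths come out to 1, 2, 3 and 3 bits, giving E[S] = 1.9 against H(X) ≈ 1.85, so both inequalities hold.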
Proof: Source coding theorem
Given "X" is an i.i.d. source, its time series $X_1, \ldots, X_n$ is i.i.d. with entropy "H(X)" in the discrete-valued case and differential entropy in the continuous-valued case. The source coding theorem states that for any $\varepsilon > 0$, i.e. for any rate $H(X) + \varepsilon$ larger than the entropy of the source, there is a large enough $n$ and an encoder that takes $n$ i.i.d. repetitions of the source, $X^{1:n}$, and maps it to $n(H(X) + \varepsilon)$ binary bits such that the source symbols $X^{1:n}$ are recoverable from the binary bits with probability of at least $1 - \varepsilon$.

Proof for achievability
Fix some $\varepsilon > 0$ and some $\delta > 0$. The typical set, $A_n^\varepsilon$, is defined as follows:
:$A_n^\varepsilon = \left\{ (x_1, \ldots, x_n) \,:\, \left| -\tfrac{1}{n} \log_2 p(x_1, \ldots, x_n) - H(X) \right| < \varepsilon \right\}$
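The following sketch (added for illustration; the Bernoulli parameter, ε and the values of n are arbitrary) evaluates the probability of the typical set exactly for a Bernoulli(p) source. Because $-\tfrac{1}{n}\log_2 p(x_1,\ldots,x_n)$ depends only on the number of ones k, $P(A_n^\varepsilon)$ is a sum of binomial terms, and it tends to 1 as n grows, which is the step the achievability argument relies on.

from math import comb, log2

p, eps = 0.3, 0.1                            # assumed Bernoulli parameter and tolerance
H = -(p * log2(p) + (1 - p) * log2(1 - p))   # entropy of the source, in bits

def prob_typical_set(n):
    # Exact P(A_n^eps): sum binomial probabilities over counts k of ones whose
    # empirical rate -(1/n) log2 p(x^n) lies within eps of H.
    total = 0.0
    for k in range(n + 1):
        rate = -(k * log2(p) + (n - k) * log2(1 - p)) / n
        if abs(rate - H) < eps:
            # work in log2 to avoid underflow of p**k * (1-p)**(n-k) for large n
            log2_term = log2(comb(n, k)) + k * log2(p) + (n - k) * log2(1 - p)
            total += 2.0 ** log2_term
    return total

for n in (10, 100, 1000, 5000):
    print(n, round(prob_typical_set(n), 4))

The printed probabilities increase toward 1 as n grows; since the typical set contains at most $2^{n(H(X)+\varepsilon)}$ sequences, an encoder that indexes only the typical sequences needs about $n(H(X)+\varepsilon)$ bits while failing with vanishing probability.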