 Coding theory

Coding theory is the study of the properties of codes and their fitness for a specific application. Codes are used for data compression, cryptography, error correction and, more recently, network coding. Codes are studied by various scientific disciplines (such as information theory, electrical engineering, mathematics, and computer science) for the purpose of designing efficient and reliable data transmission methods. This typically involves the removal of redundancy and the correction (or detection) of errors in the transmitted data.
There are essentially two aspects to coding theory:
 Data compression (or source coding)
 Error correction (or channel coding).
These two aspects may be studied in combination. Source encoding attempts to compress the data from a source in order to transmit it more efficiently. This practice is found every day on the Internet, where the common Zip data compression is used to reduce the network load and make files smaller. The second, channel encoding, adds extra data bits to make the transmission of data more robust to disturbances present on the transmission channel. The ordinary user may not be aware of many applications using channel coding. A typical music CD uses the Reed–Solomon code to correct for scratches and dust. In this application the transmission channel is the CD itself. Cell phones also use coding techniques to correct for the fading and noise of high-frequency radio transmission. Data modems, telephone transmissions, and NASA all employ channel coding techniques to get the bits through, for example turbo codes and LDPC codes.
Source coding
Main article: Data compression
The aim of source coding is to take the source data and make it smaller.
Principle
The entropy of a source is a measure of its information content. Source codes try to reduce the redundancy present in the source and represent it with fewer bits, each of which carries more information.
Data compression that explicitly tries to minimize the average length of messages according to a particular assumed probability model is called entropy encoding.
Various techniques used by source coding schemes try to approach the entropy limit of the source: C(x) ≥ H(x), where H(x) is the entropy of the source (bitrate) and C(x) is the bitrate after compression. In particular, no lossless source coding scheme can achieve a lower average bitrate than the entropy of the source.
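As a rough numerical illustration of the bound C(x) ≥ H(x), the following sketch compares the entropy of a small source with the average length of two codes for it. The four-symbol distribution and both codes are assumptions chosen for illustration, not taken from any particular system.

```python
import math


def entropy(probs):
    """Shannon entropy H(x), in bits per symbol, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def average_length(probs, code):
    """Average code word length C(x), in bits per symbol."""
    return sum(p * len(word) for p, word in zip(probs, code))


# Hypothetical source: four symbols with unequal probabilities.
probs = [0.5, 0.25, 0.125, 0.125]

# A variable-length prefix code matched to those probabilities,
# and a fixed-length code that ignores them.
prefix_code = ["0", "10", "110", "111"]
fixed_code = ["00", "01", "10", "11"]

H = entropy(probs)
C_prefix = average_length(probs, prefix_code)
C_fixed = average_length(probs, fixed_code)

print(f"H(x) = {H}, prefix code = {C_prefix}, fixed code = {C_fixed}")
```

For this particular distribution the prefix code meets the entropy exactly, while the fixed-length code spends more bits per symbol than necessary; neither falls below H(x).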
Example
Facsimile transmission uses a simple run-length code. Source coding also includes the removal of all data that is superfluous to the needs of the transmitter, which decreases the bandwidth required for transmission.
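The idea of a run-length code can be sketched in a few lines. This is a minimal illustration only: the scan line is made up, and real facsimile standards encode the run lengths further rather than storing them as plain pairs.

```python
from itertools import groupby


def rle_encode(pixels):
    """Replace each run of identical symbols by a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(pixels)]


def rle_decode(runs):
    """Expand (symbol, run length) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in runs)


# Hypothetical scan line: long white runs ("0") with short black runs ("1").
scan_line = "0" * 8 + "1" * 3 + "0" * 13 + "1" * 4
runs = rle_encode(scan_line)

print(runs)
print(rle_decode(runs) == scan_line)
```

The 28-symbol line is described by four pairs, which is where the saving comes from when runs are long.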
Channel coding
Main article: Forward error correction
The aim of channel coding theory is to find codes which transmit quickly, contain many valid code words and can correct or at least detect many errors. While not mutually exclusive, performance in these areas is a trade-off, so different codes are optimal for different applications. The needed properties of a code depend mainly on the probability of errors happening during transmission. In a typical CD, the impairment is mainly dust or scratches, so codes are used in an interleaved manner and the data is spread out over the disk.
Although not a very good code, a simple repeat code can serve as an understandable example. Suppose we take a block of data bits (representing sound) and send it three times. At the receiver we examine the three repetitions bit by bit and take a majority vote. The twist is that we don't merely send the bits in order; we interleave them. The block of data bits is first divided into 4 smaller blocks. Then we cycle through the blocks and send one bit from the first, then one from the second, and so on. This is done three times to spread the data out over the surface of the disk. In the context of the simple repeat code, this may not appear effective. However, there are more powerful codes known which are very effective at correcting the "burst" error of a scratch or a dust spot when this interleaving technique is used.
Other codes are more appropriate for different applications. Deep space communications are limited by the thermal noise of the receiver, which is more of a continuous nature than a bursty nature. Likewise, narrowband modems are limited by the noise present in the telephone network, which is also modeled better as a continuous disturbance. Cell phones are subject to rapid fading: the high frequencies used can cause rapid fading of the signal even if the receiver is moved a few inches. Again, there is a class of channel codes designed to combat fading.
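The repeat-and-interleave scheme described above can be sketched as follows. The function names, the 16-bit block and the position of the burst are all assumptions made for illustration; the block length is taken to be divisible by the number of sub-blocks.

```python
def interleave(bits, depth=4):
    """Split `bits` into `depth` equal sub-blocks and emit one bit from each in turn."""
    size = len(bits) // depth
    blocks = [bits[b * size:(b + 1) * size] for b in range(depth)]
    return [blocks[b][i] for i in range(size) for b in range(depth)]


def deinterleave(bits, depth=4):
    """Undo `interleave`, restoring the original bit order."""
    size = len(bits) // depth
    restored = [0] * len(bits)
    for position, bit in enumerate(bits):
        i, b = divmod(position, depth)
        restored[b * size + i] = bit
    return restored


def transmit(data, depth=4, repeats=3):
    """Send the interleaved block `repeats` times in a row."""
    return interleave(data, depth) * repeats


def receive(stream, length, depth=4, repeats=3):
    """De-interleave each copy, then take a bitwise majority vote."""
    copies = [deinterleave(stream[r * length:(r + 1) * length], depth)
              for r in range(repeats)]
    return [1 if 2 * sum(column) > repeats else 0 for column in zip(*copies)]


data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]  # 16 bits
sent = transmit(data)

# A hypothetical "scratch": six consecutive transmitted bits are flipped.
damaged = list(sent)
for k in range(10, 16):
    damaged[k] ^= 1

recovered = receive(damaged, len(data))
print(recovered == data)
```

Because the burst is shorter than one copy of the block, each data bit is damaged in at most one of its three copies, so the majority vote restores it.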
Linear codes
Main article: Linear code
The term algebraic coding theory denotes the subfield of coding theory where the properties of codes are expressed in algebraic terms and then further researched.
Algebraic coding theory is basically divided into two major types of codes:
 Linear block codes
 Convolutional codes.
It mainly analyzes the following three properties of a code:
 code word length
 total number of valid code words
 the minimum distance between two valid code words, using mainly the Hamming distance, sometimes also other distances like the Lee distance.
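The two distances named in the list above, and the minimum distance of a code, can be computed directly for small examples. The two codes below are standard textbook examples chosen for illustration, and the function names are not from any library.

```python
from itertools import combinations


def hamming_distance(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def lee_distance(a, b, q):
    """Sum over positions of the shortest cyclic distance between symbols modulo q."""
    return sum(min((x - y) % q, (y - x) % q) for x, y in zip(a, b))


def minimum_distance(code, distance=hamming_distance):
    """Smallest distance between any two distinct code words."""
    return min(distance(a, b) for a, b in combinations(code, 2))


repetition = [(0, 0, 0), (1, 1, 1)]
even_parity = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

print(minimum_distance(repetition))    # word length 3, 2 valid code words
print(minimum_distance(even_parity))   # word length 3, 4 valid code words
print(hamming_distance((1, 4), (4, 1)), lee_distance((1, 4), (4, 1), q=5))
```

The two binary codes have the same word length but trade the number of valid code words against minimum distance, which is the trade-off these three properties describe.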
Linear block codes
Main article: Block code
Linear block codes have the property of linearity, i.e., the sum of any two codewords is also a codeword, and they are applied to the source bits in blocks, hence the name linear block codes. There are block codes that are not linear, but it is difficult to prove that a code is a good one without this property.^{[1]}
Linear block codes are summarized by their symbol alphabets (e.g., binary or ternary) and parameters (n,m,d_{min})^{[2]} where
 n is the length of the codeword, in symbols,
 m is the number of source symbols that will be used for encoding at once,
 d_{min} is the minimum Hamming distance for the code.
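As a concrete instance of these parameters, the sketch below enumerates a binary (7,4) Hamming code from a generator matrix, checks linearity, and reads off (n, m, d_{min}). The particular systematic generator matrix G is one common choice and is an assumption of this example.

```python
from itertools import product

# One possible generator matrix for a binary (7,4) Hamming code (assumed layout:
# four message bits followed by three parity bits).
G = [
    (1, 0, 0, 0, 1, 1, 0),
    (0, 1, 0, 0, 1, 0, 1),
    (0, 0, 1, 0, 0, 1, 1),
    (0, 0, 0, 1, 1, 1, 1),
]


def encode(message, generator):
    """Multiply the message row vector by the generator matrix over GF(2)."""
    length = len(generator[0])
    return tuple(sum(bit * row[j] for bit, row in zip(message, generator)) % 2
                 for j in range(length))


def add(a, b):
    """Symbol-wise sum of two words over GF(2)."""
    return tuple((x + y) % 2 for x, y in zip(a, b))


codewords = {encode(msg, G) for msg in product((0, 1), repeat=len(G))}

# Linearity: the sum of any two codewords is again a codeword.
is_linear = all(add(a, b) in codewords for a in codewords for b in codewords)

n = len(G[0])   # codeword length in symbols
m = len(G)      # source symbols encoded at once
# For a linear code, d_min equals the smallest weight of a nonzero codeword.
d_min = min(sum(word) for word in codewords if any(word))

print(is_linear, (n, m, d_min))
```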
There are many types of linear block codes, such as
 Cyclic codes (e.g., Hamming codes)
 Repetition codes
 Parity codes
 Polynomial codes (e.g., BCH codes)
 Reed–Solomon codes
 Algebraic geometric codes
 Reed–Muller codes
 Perfect codes.
Block codes are tied to the sphere packing problem, which has received some attention over the years. In two dimensions, it is easy to visualize. Take a bunch of pennies flat on the table and push them together. The result is a hexagon pattern like a bee's nest. But block codes rely on more dimensions which cannot easily be visualized. The powerful (24,12) Golay code used in deep space communications uses 24 dimensions. If used as a binary code (which it usually is) the dimensions refer to the length of the codeword as defined above.
The theory of coding uses the N-dimensional sphere model. For example, how many pennies can be packed into a circle on a tabletop, or in 3 dimensions, how many marbles can be packed into a globe. Other considerations enter the choice of a code. For example, hexagon packing into the constraint of a rectangular box will leave empty space at the corners. As the dimensions get larger, the percentage of empty space grows smaller. But at certain dimensions, the packing uses all the space and these codes are the so-called "perfect" codes. The only nontrivial and useful perfect codes are the distance-3 Hamming codes with parameters satisfying (2^{r} – 1, 2^{r} – 1 – r, 3), and the [23,12,7] binary and [11,6,5] ternary Golay codes.^{[1]}^{[2]}
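The statement that a perfect code "uses all the space" can be checked numerically: the spheres of radius t = ⌊(d − 1)/2⌋ around the codewords must cover every word exactly once. The helper names below are hypothetical; the parameters are those quoted in the text.

```python
from math import comb


def sphere_size(n, t, q=2):
    """Number of length-n words over a q-ary alphabet within Hamming distance t of a word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))


def is_perfect(n, k, d, q=2):
    """True if q**k disjoint spheres of radius (d - 1) // 2 fill the whole space exactly."""
    t = (d - 1) // 2
    return q ** k * sphere_size(n, t, q) == q ** n


# Distance-3 Hamming codes (2^r - 1, 2^r - 1 - r, 3) for a few values of r.
for r in range(2, 6):
    print((2 ** r - 1, 2 ** r - 1 - r, 3), is_perfect(2 ** r - 1, 2 ** r - 1 - r, 3))

print("binary Golay [23,12,7]:", is_perfect(23, 12, 7))
print("ternary Golay [11,6,5]:", is_perfect(11, 6, 5, q=3))
print("extended Golay (24,12), d = 8:", is_perfect(24, 12, 8))
```

The check only tests the counting condition for given parameters; it does not construct the codes themselves.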
Another code property is the number of neighbors that a single codeword may have.^{[3]} Again, consider pennies as an example. First we pack the pennies in a rectangular grid. Each penny will have 4 near neighbors (and 4 at the corners which are farther away). In a hexagon, each penny will have 6 near neighbors. When we increase the dimensions, the number of near neighbors increases very rapidly. The result is the number of ways for noise to make the receiver choose a neighbor (hence an error) grows as well. This is a fundamental limitation of block codes, and indeed all codes. It may be harder to cause an error to a single neighbor, but the number of neighbors can be large enough so the total error probability actually suffers.^{[3]}
Properties of linear block codes are used in many applications. For example, the syndrome-coset uniqueness property of linear block codes is used in trellis shaping,^{[4]} one of the best known shaping codes. This same property is used in sensor networks for distributed source coding.
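The syndrome-coset uniqueness property can be seen on a small example: every word in the same coset of the code has the same syndrome, and distinct cosets have distinct syndromes. The sketch below uses one possible parity-check matrix of a (7,4) Hamming code, which is an assumption of this example; it does not implement trellis shaping or distributed source coding.

```python
from itertools import product

# Assumed parity-check matrix: column j is the binary expansion of j + 1.
H = [
    (0, 0, 0, 1, 1, 1, 1),
    (0, 1, 1, 0, 0, 1, 1),
    (1, 0, 1, 0, 1, 0, 1),
]


def syndrome(word):
    """Product of the parity-check matrix with the word, over GF(2)."""
    return tuple(sum(h * w for h, w in zip(row, word)) % 2 for row in H)


# Group all 128 binary words of length 7 by syndrome; each group is one coset.
cosets = {}
for word in product((0, 1), repeat=7):
    cosets.setdefault(syndrome(word), []).append(word)

code = cosets[(0, 0, 0)]  # the coset with zero syndrome is the code itself


def correct_single_error(word):
    """Flip the position named by the syndrome (read as a binary number), if any."""
    s = syndrome(word)
    position = 4 * s[0] + 2 * s[1] + s[2]
    fixed = list(word)
    if position:
        fixed[position - 1] ^= 1
    return tuple(fixed)


print(len(cosets), sorted({len(members) for members in cosets.values()}))
```

With this matrix the syndrome of a single-bit error spells out the error position directly, which is why each of the eight cosets corresponds to exactly one correctable error pattern.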
Convolutional codes
Main article: Convolutional code
The idea behind a convolutional code is to make every codeword symbol the weighted sum of various input message symbols. This is like the convolution used in LTI systems to find the output of a system when the input and impulse response are known.
The output of a convolutional encoder is therefore generally found by convolving the input bits with the states of the encoder's registers.
Fundamentally, convolutional codes do not offer more protection against noise than an equivalent block code. In many cases, they offer greater simplicity of implementation than a block code of equal power. The encoder is usually a simple circuit which has state memory and some feedback logic, normally XOR gates. The decoder can be implemented in software or firmware.
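A minimal sketch of such an encoder is shown below. It assumes a rate-1/2 feedforward encoder with two memory cells and the tap patterns (1,1,1) and (1,0,1), a common textbook choice; it has no feedback path, and the function name is hypothetical.

```python
def conv_encode(bits, generators=((1, 1, 1), (1, 0, 1))):
    """Rate-1/len(generators) feedforward convolutional encoder.

    Each output bit is the XOR of the current input bit and the register
    contents selected by one tap pattern. Zeros are appended to the input
    so that the registers return to the all-zero state.
    """
    memory = len(generators[0]) - 1
    state = [0] * memory
    output = []
    for bit in list(bits) + [0] * memory:
        window = [bit] + state          # current input followed by stored bits
        for taps in generators:
            output.append(sum(t & w for t, w in zip(taps, window)) % 2)
        state = window[:-1]             # shift the register by one position
    return output


impulse_response = conv_encode([1])
encoded = conv_encode([1, 0, 1, 1])
print(impulse_response)
print(encoded)
```

Feeding in a single 1 returns the tap patterns themselves, interleaved, which is the sense in which the encoder convolves the input with an impulse response.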
The Viterbi algorithm is the optimum algorithm used to decode convolutional codes. There are simplifications to reduce the computational load, which rely on searching only the most likely paths. Although not optimum, they have generally been found to give good results in lower-noise environments.
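A compact hard-decision version of the Viterbi algorithm can be sketched as follows. It assumes the same kind of small rate-1/2 encoder with tap patterns (1,1,1) and (1,0,1), terminated with zeros; the names and the test message are illustrative, and practical decoders are organized quite differently for speed.

```python
G = ((1, 1, 1), (1, 0, 1))   # assumed tap patterns of the encoder
MEMORY = len(G[0]) - 1


def step(state, bit):
    """Advance the encoder by one input bit; return (next state, output bits)."""
    window = (bit,) + state
    out = tuple(sum(t & w for t, w in zip(taps, window)) % 2 for taps in G)
    return window[:-1], out


def encode(bits):
    """Encode and flush the registers back to the all-zero state."""
    state, output = (0,) * MEMORY, []
    for bit in list(bits) + [0] * MEMORY:
        state, out = step(state, bit)
        output.extend(out)
    return output


def viterbi_decode(received):
    """Return the message whose codeword is closest in Hamming distance."""
    start = (0,) * MEMORY
    survivors = {start: (0, [])}          # state -> (path metric, input bits)
    for t in range(0, len(received), len(G)):
        chunk = received[t:t + len(G)]
        candidates = {}
        for state, (metric, path) in survivors.items():
            for bit in (0, 1):
                nxt, out = step(state, bit)
                cost = metric + sum(o != r for o, r in zip(out, chunk))
                # Keep only the cheapest path into each state.
                if nxt not in candidates or cost < candidates[nxt][0]:
                    candidates[nxt] = (cost, path + [bit])
        survivors = candidates
    _, path = survivors[start]            # the flushed encoder ends in state zero
    return path[:-MEMORY]                 # drop the flushing zeros


message = [1, 0, 1, 1, 0, 0, 1, 0]
noisy = encode(message)
noisy[3] ^= 1                             # two hypothetical channel errors
noisy[10] ^= 1
decoded = viterbi_decode(noisy)
print(decoded == message)
```

Keeping one survivor per state is what makes the search exhaustive yet tractable; the simplifications mentioned above discard some of these survivors early.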
Convolutional codes are used in voiceband modems (V.32, V.17, V.34) and in GSM mobile phones, as well as satellite and military communication devices.
Other applications of coding theory
Another concern of coding theory is designing codes that help synchronization. A code may be designed so that a phase shift can be easily detected and corrected and that multiple signals can be sent on the same channel.
Another application of codes, used in some mobile phone systems, is code-division multiple access (CDMA). Each phone is assigned a code sequence that is approximately uncorrelated with the codes of other phones. When transmitting, the code word is used to modulate the data bits representing the voice message. At the receiver, a demodulation process is performed to recover the data. The properties of this class of codes allow many users (with different codes) to use the same radio channel at the same time. To the receiver, the signals of other users will appear to the demodulator only as low-level noise.
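The principle can be illustrated with a toy model in which two users share a channel. This sketch assumes perfectly synchronized users and Walsh codes, which are exactly orthogonal rather than merely approximately uncorrelated; the user names and bit patterns are made up.

```python
def walsh_codes(order):
    """Rows of a 2**order x 2**order Hadamard matrix with entries +1 and -1."""
    rows = [[1]]
    for _ in range(order):
        rows = [r + r for r in rows] + [r + [-x for x in r] for r in rows]
    return rows


def spread(bits, code):
    """Map bits 0/1 to -1/+1 and multiply each one by the whole code sequence."""
    return [(1 if bit else -1) * chip for bit in bits for chip in code]


def despread(signal, code):
    """Correlate each code-length slice of the signal with the user's code."""
    n = len(code)
    bits = []
    for i in range(0, len(signal), n):
        correlation = sum(s * c for s, c in zip(signal[i:i + n], code))
        bits.append(1 if correlation > 0 else 0)
    return bits


codes = walsh_codes(2)                     # four codes of length four
alice_bits, bob_bits = [1, 0, 1, 1], [0, 0, 1, 0]

# Both users transmit at the same time; the channel simply adds their signals.
channel = [a + b for a, b in zip(spread(alice_bits, codes[1]),
                                 spread(bob_bits, codes[2]))]

print(despread(channel, codes[1]) == alice_bits)
print(despread(channel, codes[2]) == bob_bits)
```

Because the two codes have zero correlation, each receiver recovers its own bits and the other user's signal cancels completely; with only approximately uncorrelated codes it would remain as residual noise.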
Another general class of codes are the automatic repeat-request (ARQ) codes. In these codes the sender adds redundancy to each message for error checking, usually by adding check bits. If the check bits are not consistent with the rest of the message when it arrives, the receiver will ask the sender to retransmit the message. All but the simplest wide area network protocols use ARQ. Common protocols include SDLC (IBM), TCP (Internet), X.25 (International) and many others. There is an extensive field of research on this topic because of the problem of matching a rejected packet against a new packet. Is it a new one or is it a retransmission? Typically numbering schemes are used, as in TCP ("RFC793". RFCs. Internet Engineering Task Force (IETF). 1981-09. http://tools.ietf.org/html/rfc793).
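A toy stop-and-wait exchange shows how check bits and a numbering scheme work together. Everything here is a simplified assumption for illustration: the frame layout, the one-bit sequence number, the use of CRC-32 as the check bits, and the scripted channel faults. It is not how TCP or any of the protocols named above is implemented.

```python
import zlib


def make_frame(seq, payload):
    """Frame = 1-byte sequence number + payload + 4 check bytes (CRC-32)."""
    body = bytes([seq]) + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def check_frame(frame):
    """Return (seq, payload) if the check bytes are consistent, else None."""
    body, check = frame[:-4], frame[-4:]
    if zlib.crc32(body).to_bytes(4, "big") != check:
        return None
    return body[0], body[1:]


class Receiver:
    def __init__(self):
        self.expected = 0
        self.delivered = []

    def on_frame(self, frame):
        """Return the acknowledged sequence number, or None to request a resend."""
        parsed = check_frame(frame)
        if parsed is None:
            return None
        seq, payload = parsed
        if seq == self.expected:          # new frame: deliver it
            self.delivered.append(payload)
            self.expected ^= 1
        return seq                        # duplicates are acknowledged, not delivered


def send_all(messages, corrupt_on=(), lose_ack_on=()):
    """Send messages in order; faults are scripted by transmission number."""
    receiver, attempt = Receiver(), 0
    for index, payload in enumerate(messages):
        seq = index % 2
        while True:
            attempt += 1
            frame = make_frame(seq, payload)
            if attempt in corrupt_on:     # the channel damages the first byte
                frame = bytes([frame[0] ^ 0xFF]) + frame[1:]
            ack = receiver.on_frame(frame)
            if ack == seq and attempt not in lose_ack_on:
                break                     # acknowledged: move to the next message
    return receiver.delivered, attempt


delivered, transmissions = send_all([b"alpha", b"beta", b"gamma"],
                                    corrupt_on={2}, lose_ack_on={4})
print(delivered, transmissions)
```

The lost acknowledgement is the interesting case: the sender retransmits a frame the receiver already has, and only the sequence number lets the receiver recognize it as a retransmission rather than a new message.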
Group testing
Group testing uses codes in a different way. Consider a large group of items in which a very few are different in a particular way (e.g., defective products or infected test subjects). The idea of group testing is to determine which items are "different" by using as few tests as possible. The problem has its roots in the Second World War, when the United States Army Air Forces needed to test its soldiers for syphilis. It originated from a groundbreaking paper by Robert Dorfman.
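One simple strategy, the two-stage pooling scheme usually associated with Dorfman's paper, can be sketched as follows: test pools of items first, and retest individually only the members of pools that come back positive. The population size, pool size and positions of the "different" items are made-up values.

```python
def two_stage_test(items, pool_size):
    """Pool items, test each pool once, then retest members of positive pools.

    `items` is a list of booleans (True means "different").
    Returns the indices found and the total number of tests performed.
    """
    tests, found = 0, []
    for start in range(0, len(items), pool_size):
        pool = range(start, min(start + pool_size, len(items)))
        tests += 1                              # one test for the whole pool
        if any(items[i] for i in pool):
            for i in pool:                      # positive pool: test each member
                tests += 1
                if items[i]:
                    found.append(i)
    return found, tests


# Hypothetical population of 100 items, two of which are "different".
population = [False] * 100
for index in (17, 62):
    population[index] = True

found, tests = two_stage_test(population, pool_size=10)
print(found, tests)
```

When the "different" items are rare, most pools test negative and are cleared with a single test, so the total is well below one test per item.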
Analog coding
Information is encoded analogously in the neural networks of brains, in analog signal processing, and analog electronics. Aspects of analog coding include analog error correction,^{[5]} analog data compression,^{[6]} and analog encryption.^{[7]}
Neural coding
Neural coding is a neuroscience-related field concerned with how sensory and other information is represented in the brain by networks of neurons. The main goal of studying neural coding is to characterize the relationship between the stimulus and the individual or ensemble neuronal responses and the relationship among electrical activity of the neurons in the ensemble.^{[8]} It is thought that neurons can encode both digital and analog information,^{[9]} and that neurons follow the principles of information theory and compress information,^{[10]} and detect and correct^{[11]} errors in the signals that are sent throughout the brain and wider nervous system.
See also
 Coding gain
 Covering code
 Error-correcting code
 Group testing
 Hamming distance, Hamming weight
 Information theory
 Lee distance
 Spatial coding and MIMO in multiple antenna research
 Spatial diversity coding is spatial coding that transmits replicas of the information signal along different spatial paths, so as to increase the reliability of the data transmission.
 Spatial interference cancellation coding
 Spatial multiplex coding
 Timeline of information theory, data compression, and error correcting codes
 List of algebraic coding theory topics
Notes
 ^ ^{a} ^{b} Audrey Terras (1999). Fourier Analysis on Finite Groups and Applications. Cambridge University Press. ISBN 0521457181. http://books.google.com/books?id=B2TA669dJMC&pg=PA195&dq=linear+block+code+difficult+prove+nonlinear#PPA195,M1.
 ^ ^{a} ^{b} Richard E. Blahut (2003). Algebraic Codes for Data Transmission. Cambridge University Press. ISBN 0521553741. http://books.google.com/books?id=n0XHMY58tL8C&pg=PA60&dq=golay+hamming+only+perfect.
 ^ ^{a} ^{b} Christian Schlegel and Lance Pérez (2004). Trellis and turbo coding. Wiley-IEEE. p. 73. ISBN 9780471227557. http://books.google.com/books?id=9wRCjfGAaEcC&pg=PA73.
 ^ G.D. Forney, Jr. (March 1992), Trellis shaping, IEEE Transactions on Information Theory, Vol. 38, Issue 2, Part 2.
 ^ Analog Error-Correcting Codes Based on Chaotic Dynamical Systems, Brian Chen and Gregory W. Wornell, IEEE Transactions on Communications, Vol. 46, No. 7, July 1998
 ^ On Analog Signature Analysis, Franc Novak, Bojan Hvala, Sandi Klavžar, "Proceedings of the conference on Design, automation and test in Europe", 1999, ISBN 1581131216
 ^ Cryptanalyzing an Encryption Scheme Based on Blind Source Separation, Shujun Li, Chengqing Li, Kwok-Tung Lo, Guanrong Chen, IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 55, No. 4, pages 1055–1063, April 2008
 ^ Brown EN, Kass RE, and Mitra PP. 2004. Multiple neural spike train data analysis: state-of-the-art and future challenges. Nature Neuroscience 7:456–61
 ^ Spike arrival times: A highly efficient coding scheme for neural networks, SJ Thorpe, Parallel processing in neural systems, 1990
 ^ Information Distortion and Neural Coding, Tomáš Gedeon, Albert E. Parker, Alexander G. Dimitrov
 ^ Spike timing precision and neural error correction: Local behavior, M Stiber, Neural computation, 2005
References
 Vera Pless (1982), Introduction to the Theory of Error-Correcting Codes, John Wiley & Sons, Inc., ISBN 0471086843.
 Elwyn R. Berlekamp (1984), Algebraic Coding Theory, Aegean Park Press (revised edition), ISBN 0894120638, ISBN 9780894120633.
 Randy Yates, A Coding Theory Tutorial.
Wikimedia Foundation. 2010.