Noisy channel model

The noisy channel model is a framework used in spell checkers, question answering, speech recognition, and machine translation. In this model, the goal is to recover the intended word from a received word whose letters have been scrambled in some manner.

Definition

Given an alphabet Σ, let Σ* be the set of all finite strings over Σ. Let the dictionary D of valid words be some subset of Σ*, i.e., D ⊆ Σ*.

The noisy channel is the matrix

Γ_{ws} = Pr(s | w),

where w ∈ D is the intended word and s ∈ Σ* is the scrambled word that was actually received.

Example

Consider the English alphabet Σ = {a, b, c, ..., y, z, A, B, ..., Z, ...}. Some subset D ⊆ Σ* makes up the dictionary of valid English words.

There are several mistakes that may occur while typing, including:

  1. Missing letters, e.g., leter instead of letter
  2. Accidental letter additions, e.g., misstake instead of mistake
  3. Swapping letters, e.g., recieved instead of received
  4. Replacing letters, e.g., fimite instead of finite
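
The four kinds of mistake listed above can be enumerated mechanically. The following is a minimal Python sketch, assuming a lowercase ASCII alphabet and exactly one mistake per word; the function name single_edits is hypothetical and not part of the model's definition.

```python
import string

def single_edits(word, alphabet=string.ascii_lowercase):
    """Return every string reachable from `word` by exactly one typing mistake."""
    # Every way of cutting the word into a left part and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    # 1. Missing letter: drop the first character of the right part.
    deletions = {a + b[1:] for a, b in splits if b}
    # 2. Accidental addition: put one extra letter at the cut.
    insertions = {a + c + b for a, b in splits for c in alphabet}
    # 3. Swapped letters: exchange the two characters after the cut.
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    # 4. Replaced letter: overwrite the first character of the right part.
    substitutions = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    return (deletions | insertions | transpositions | substitutions) - {word}
```

For instance, "leter" is in single_edits("letter") and "recieved" is in single_edits("received"). A channel that only allows one mistake per word would assign nonzero probability to exactly these strings, plus the intended word itself.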

To construct the noisy channel matrix Γ, we must consider the probability of each mistake given the intended word, that is, Pr(s | w) for all w ∈ D and s ∈ Σ*. These probabilities may be estimated, for example, from the Levenshtein distance between s and w, or by comparing the draft of an essay with a version that has been manually edited for spelling.
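
As an illustration of the draft-versus-edited approach, the sketch below estimates Pr(s | w) by relative frequency. It assumes the two versions have already been aligned word by word into (intended, typed) pairs; the pairs shown are made up for the example, and the name estimate_channel is hypothetical.

```python
from collections import Counter, defaultdict

def estimate_channel(pairs):
    """Estimate Pr(s | w) by relative frequency from (intended, typed) pairs."""
    counts = defaultdict(Counter)
    for intended, typed in pairs:
        counts[intended][typed] += 1
    # Normalize each row so that it sums to 1 over the observed typed forms.
    return {
        w: {s: n / sum(row.values()) for s, n in row.items()}
        for w, row in counts.items()
    }

# Hypothetical aligned words from a draft and its corrected version.
pairs = [
    ("received", "received"), ("received", "recieved"),
    ("received", "received"), ("received", "received"),
    ("letter", "leter"), ("letter", "letter"),
]
gamma = estimate_channel(pairs)
```

Here gamma["received"]["recieved"] is the fraction of times the intended word "received" was typed as "recieved". With so little data most entries of Γ are never observed, so a practical estimate would likely need smoothing or a fallback such as an edit-distance-based score.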

Error-correction

The goal of the noisy channel model is to find the intended word given the scrambled word that was received. The decision function σ : Σ* → D is a function that, given a scrambled word, returns an estimate of the intended word.

Methods of constructing a decision function include the maximum likelihood rule, the maximum a posteriori rule, and the minimum distance rule.
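
The three rules can be compared side by side. The maximum likelihood rule picks the w that maximizes Pr(s | w), the maximum a posteriori rule picks the w that maximizes Pr(s | w) Pr(w), and the minimum distance rule picks the w closest to s. The Python sketch below is only an illustration under stated assumptions: the dictionary and its prior probabilities are made up, and the likelihood is a hypothetical unnormalized score, error_rate raised to the Levenshtein distance, standing in for a real channel matrix.

```python
def edit_distance(s, w):
    """Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(w) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, cw in enumerate(w, start=1):
            curr.append(min(
                prev[j] + 1,               # drop a character of s
                curr[j - 1] + 1,           # add a character of w
                prev[j - 1] + (cs != cw),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

# Hypothetical toy dictionary with made-up prior probabilities Pr(w).
PRIOR = {"finite": 0.04, "finish": 0.95, "final": 0.01}

def likelihood(s, w, error_rate=0.1):
    """Assumed unnormalized stand-in for Pr(s | w)."""
    return error_rate ** edit_distance(s, w)

def minimum_distance(s, prior=PRIOR):
    return min(prior, key=lambda w: edit_distance(s, w))

def maximum_likelihood(s, prior=PRIOR):
    return max(prior, key=lambda w: likelihood(s, w))

def maximum_a_posteriori(s, prior=PRIOR):
    return max(prior, key=lambda w: likelihood(s, w) * prior[w])
```

For the received string "finita", the minimum distance and maximum likelihood rules both return "finite", which is one edit away, while the maximum a posteriori rule returns "finish", because its much larger assumed prior outweighs the extra edit. With this particular likelihood and an error rate below 1, maximum likelihood always agrees with minimum distance; the two can differ when the channel treats some mistakes as more probable than others.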

In some cases, it may be better to accept the scrambled word as the intended word rather than attempt to find an intended word in the dictionary. For example, the word schönfinkeling may not be in the dictionary, but might in fact be the intended word.
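
One simple way to allow for this is to correct a word only when some dictionary entry is sufficiently similar, and otherwise return the received word unchanged. The sketch below uses difflib.get_close_matches from the Python standard library as the similarity test. Its ratio is neither a Levenshtein distance nor a probability, and the cutoff of 0.8, the tiny dictionary, and the name correct_or_accept are all assumptions made for illustration.

```python
import difflib

def correct_or_accept(s, dictionary, cutoff=0.8):
    """Return a close dictionary word if one exists, otherwise accept s as is."""
    if s in dictionary:
        return s
    # Ask for at most one match whose similarity ratio is at least `cutoff`.
    matches = difflib.get_close_matches(s, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else s

# Hypothetical toy dictionary.
dictionary = ["finite", "letter", "mistake", "received"]
```

With this dictionary, correct_or_accept("fimite", dictionary) returns "finite", whereas correct_or_accept("schönfinkeling", dictionary) returns the input unchanged because no entry is similar enough.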

Wikimedia Foundation. 2010.
