Example-based machine translation

Example-based machine translation (EBMT) is an approach to machine translation characterized by its use of a bilingual corpus of parallel texts as its main knowledge base at run time. It is essentially translation by analogy and can be viewed as an implementation of the case-based reasoning approach to machine learning.

At the foundation of example-based machine translation is the idea of translation by analogy. Applied to human translation, this idea rejects the view that people translate sentences by performing deep linguistic analysis. Instead, it holds that people translate by first decomposing a sentence into phrases, then translating those phrases, and finally composing the translated fragments into one sentence. Each phrase is translated by analogy to previous translations. The principle of translation by analogy is encoded in an example-based machine translation system through the example translations used to train it.

Example-based machine translation systems are trained on bilingual parallel corpora, which contain sentence pairs like the example shown in the table. Each sentence pair consists of a sentence in one language and its translation into another. The example shown is a "minimal pair", meaning that the sentences differ in just one element. Such sentences make it simple to learn translations of subsentential units. For example, an example-based machine translation system would learn three units of translation:
# "How much is that" X "?" corresponds to "Ano X wa ikura desu ka."
# "red umbrella" corresponds to "akai kasa"
# "small camera" corresponds to "chiisai kamera"
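
A minimal sketch of how these three units could be extracted from a minimal pair follows. It is an illustration only, not a description of any particular EBMT system. The two sentence pairs are reconstructed by filling the template in unit 1 with the fragments in units 2 and 3, and the function names (`tokenize`, `split_minimal_pair`, `learn_units`) are hypothetical. The sketch also assumes that the one span that differs on the source side translates to the one span that differs on the target side.

```python
import re

def tokenize(text):
    # Split a sentence into word tokens and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)

def shared_prefix(a, b):
    # Number of leading tokens that two token lists have in common.
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def split_minimal_pair(first, second):
    # Split two sentences that differ in one contiguous span into
    # a template (with slot "X") and the two differing fragments.
    a, b = tokenize(first), tokenize(second)
    p = shared_prefix(a, b)
    # Shared suffix is measured on the remainders so it cannot overlap the prefix.
    s = shared_prefix(a[p:][::-1], b[p:][::-1])
    template = a[:p] + ["X"] + a[len(a) - s:]
    return template, a[p:len(a) - s], b[p:len(b) - s]

def learn_units(pair1, pair2):
    # Each pair is (source sentence, target sentence).
    # Assumption: the differing source span aligns with the differing target span.
    src_tpl, src_a, src_b = split_minimal_pair(pair1[0], pair2[0])
    tgt_tpl, tgt_a, tgt_b = split_minimal_pair(pair1[1], pair2[1])
    return [
        (" ".join(src_tpl), " ".join(tgt_tpl)),
        (" ".join(src_a), " ".join(tgt_a)),
        (" ".join(src_b), " ".join(tgt_b)),
    ]

# Sentence pairs reconstructed from the three units listed above.
units = learn_units(
    ("How much is that red umbrella?", "Ano akai kasa wa ikura desu ka."),
    ("How much is that small camera?", "Ano chiisai kamera wa ikura desu ka."),
)
for source, target in units:
    print(source, "<->", target)
```

Real corpora rarely consist of clean minimal pairs, so a practical system would need a more robust alignment step than this prefix and suffix comparison.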

Composing these units makes it possible to produce novel translations. For example, suppose the system has been trained on text containing the sentences:

"President Kennedy was shot dead during the parade." and "The convict escaped on July 15th." The system could then translate the sentence "The convict was shot dead during the parade." by substituting the appropriate parts of the example sentences.
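
The matching step in this substitution can be sketched as follows. This is a simplified illustration under stated assumptions, not a definitive algorithm: it uses a greedy longest-match strategy (one possible choice among many) to cover the input with fragments found in the stored examples, and the names `find_span` and `cover` are hypothetical. The text gives only the English side of these examples, so the sketch stops at source-side matching. A full system would also need the aligned target-language fragment for each match.

```python
import re

def tokenize(text):
    # Split a sentence into word tokens and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)

def find_span(span, example):
    # True if the token list `span` occurs contiguously in `example`.
    n = len(span)
    return any(example[i:i + n] == span for i in range(len(example) - n + 1))

def cover(sentence, examples):
    # Greedily cover `sentence` with the longest fragments found in `examples`.
    # Returns a list of (fragment text, index of the example it came from).
    tokens = tokenize(sentence)
    corpus = [tokenize(e) for e in examples]
    pos, pieces = 0, []
    while pos < len(tokens):
        # Try the longest candidate span first, then shrink it.
        for end in range(len(tokens), pos, -1):
            span = tokens[pos:end]
            src = next((i for i, ex in enumerate(corpus) if find_span(span, ex)), None)
            if src is not None:
                pieces.append((" ".join(span), src))
                pos = end
                break
        else:
            raise ValueError(f"no example covers token {tokens[pos]!r}")
    return pieces

examples = [
    "President Kennedy was shot dead during the parade.",
    "The convict escaped on July 15th.",
]
for fragment, index in cover("The convict was shot dead during the parade.", examples):
    print(f"{fragment!r} taken from example {index}")
```

Here the input is covered by "The convict" from the second example and "was shot dead during the parade ." from the first, which mirrors the substitution described above.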

Other approaches to machine translation, including statistical machine translation, also use bilingual corpora to learn the process of translation.

Example-based machine translation was first suggested by Makoto Nagao in 1984. [Makoto Nagao (1984). "A framework of a mechanical translation between Japanese and English by analogy principle". In A. Elithorn and R. Banerji (eds.), Artificial and Human Intelligence. Elsevier Science Publishers. http://www.mt-archive.info/Nagao-1984.pdf] It soon attracted the attention of scientists in the field of natural language processing.

See also

* Machine translation
* Machine learning
* Statistical machine translation
* Translation memory

Wikimedia Foundation. 2010.
