Machine translation software usability

The sections below give objective criteria for evaluating the usability of machine translation software output.

Stationarity or Canonical Form

Do repeated translations converge on a single expression in both languages? That is, does the translation method show stationarity or produce a canonical form? In the example given under "Semantics preservation" in this article, the translation does become stationary, although the original meaning is lost. See Round-trip translation for further discussion. This metric has been criticized as not being well correlated with Bilingual Evaluation Understudy (BLEU) scores [Somers, H. (2005) "[http://www.co.umist.ac.uk/~harold/RoundTrip.doc Round-trip Translation: What Is It Good For?]"].
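The stationarity check can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real system: the function name `round_trip_until_stationary` and the two lookup tables are hypothetical, and the tables merely replay the English and French outputs quoted in this article instead of calling a translation service. The sketch tests convergence on the source-language side only.

```python
from typing import Callable, List, Tuple


def round_trip_until_stationary(
    text: str,
    forward: Callable[[str], str],
    backward: Callable[[str], str],
    max_rounds: int = 10,
) -> Tuple[List[str], bool]:
    """Repeat source -> target -> source until the source-side text stops changing.

    Returns the source-side texts seen (starting with the input) and a flag
    saying whether a fixed point was reached within max_rounds.
    """
    history = [text]
    for _ in range(max_rounds):
        next_text = backward(forward(history[-1]))
        if next_text == history[-1]:
            return history, True
        history.append(next_text)
    return history, False


# Hypothetical stand-ins for a translation system: plain lookup tables that
# replay the outputs quoted in this article.
EN_TO_FR = {
    "Better a day earlier than a day late.":
        "Améliorer un jour plus tôt qu'un jour tard.",
    "To improve one day earlier than a day late.":
        "Pour améliorer un jour plus tôt qu'un jour tard.",
}
FR_TO_EN = {
    "Améliorer un jour plus tôt qu'un jour tard.":
        "To improve one day earlier than a day late.",
    "Pour améliorer un jour plus tôt qu'un jour tard.":
        "To improve one day earlier than a day late.",
}

history, stationary = round_trip_until_stationary(
    "Better a day earlier than a day late.",
    EN_TO_FR.__getitem__,
    FR_TO_EN.__getitem__,
)
print(stationary)   # True
print(history[-1])  # To improve one day earlier than a day late.
```

The fixed point reached here differs from the input sentence, which is why stationarity on its own says nothing about whether the meaning survived.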

Adaptive to colloquialism, argot or slang

Is the system adaptive to colloquialism, argot or slang? The French language has many rules for creating words in the speech and writing of popular culture. Two such rules are: (a) The inversion of the syllables of a word, so that "femme" becomes "meuf". (This is called verlan.) (b) The attachment of the suffix "-ard" to a noun or verb to form a proper noun. For example, the noun "faluche" means "student hat". The word "faluchard", formed from "faluche", can colloquially mean, depending on context, "a group of students", "a gathering of students" or "behavior typical of a student". The Google translator as of 28 December 2006 did not handle words constructed by rule (b), as shown here:

Il y a une chorale falucharde mercredi, venez nombreux, les faluchards chantent des paillardes! => "There is a choral society falucharde Wednesday, come many, the faluchards sing loose-living women!"

French argot has three levels of usage: [[http://chitlinsandcamembert.blogspot.com/2005/10/agony-of-argot.html "The Agony of Argot", Chitlins & Camembert, October 28, 2005]]
1. "familier" or friendly, acceptable among friends, family and peers but not at work
2. "grossier" or swear words, acceptable among friends and peers but not at work or in family
3. "verlan" or ghetto slang, acceptable among lower classes but not among middle or upper classes

The United States National Institute of Standards and Technology conducts annual evaluations [http://www.nist.gov/speech/tests/mt/] of machine translation systems based on the BLEU-4 criterion [http://www.nist.gov/speech/tests/mt/doc/mt06_evalplan.v4.pdf]. A combined method called IQmt, which incorporates BLEU and the additional metrics NIST, GTM, ROUGE and METEOR, has been implemented by Gimenez and Amigo [http://www.lsi.upc.edu/~nlp/IQMT/IQMT.v1.0.pdf].
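For readers unfamiliar with BLEU-4, the following Python sketch shows the core of the score as described by Papineni et al. (2002): clipped n-gram precisions for n = 1 to 4, combined by a geometric mean and multiplied by a brevity penalty. It is a simplified illustration under stated assumptions (one reference, one sentence, whitespace tokenization, no smoothing) and does not reproduce the NIST scoring procedure. The function names `ngrams` and `bleu4` are chosen here for illustration.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu4(candidate: List[str], reference: List[str]) -> float:
    """Sentence-level BLEU-4 against a single reference, without smoothing."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, 5):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty precision zeroes the score
        log_precision_sum += math.log(clipped / total)
    # Brevity penalty: penalize candidates no longer than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_precision_sum / 4)


reference = "better a day earlier than a day late .".split()
candidate = "to improve one day earlier than a day late .".split()
score = bleu4(candidate, reference)
print(0.0 < score < 1.0)  # True
```

Real evaluations score a whole test set against several references at once, so numbers from this sketch are not comparable to published BLEU figures.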

Well-formed output

Is the output grammatical or well-formed in the target language? Using an interlingua should be helpful in this regard, because with a fixed interlingua one should be able to write a grammatical mapping from the interlingua to the target language. Consider the following Arabic language input and English language translation result from the Google translator as of 27 December 2006 [http://www.google.com/language_tools?hl=en]. This Google translator output does not parse using a reasonable English grammar:

وعن حوادث التدافع عند شعيرة رمي الجمرات -التي كثيرا ما يسقط فيها العديد من الضحايا- أشار الأمير نايف إلى إدخال "تحسينات كثيرة في جسر الجمرات ستمنع بإذن الله حدوث أي تزاحم". => And incidents at the push Carbuncles-throwing ritual, which often fall where many of the victims - Prince Nayef pointed to the introduction of "many improvements in bridge Carbuncles God would stop the occurrence of any competing."

Semantics preservation

Do repeated re-translations preserve the semantics of the original sentence? For example, consider the following English input passed multiple times into and out of French using the Google translator as of 27 December 2006:

Better a day earlier than a day late. => "Améliorer un jour plus tôt qu'un jour tard." => To improve one day earlier than a day late. => "Pour améliorer un jour plus tôt qu'un jour tard." => To improve one day earlier than a day late.
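One rough way to put a number on the drift in this example is to compare the word sets of the original and the round-trip result. The Python sketch below uses Jaccard overlap for this. It is only a crude lexical proxy, assumed here for illustration, and not a measure of meaning; the helper names `word_set` and `jaccard` are hypothetical.

```python
from typing import Set


def word_set(sentence: str) -> Set[str]:
    """Lowercase the words, strip surrounding punctuation, return them as a set."""
    words = {w.strip(".,;:!?\"'").lower() for w in sentence.split()}
    words.discard("")
    return words


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the word sets of two sentences (1.0 means identical sets)."""
    set_a, set_b = word_set(a), word_set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


original = "Better a day earlier than a day late."
round_trip = "To improve one day earlier than a day late."
overlap = jaccard(original, round_trip)
print(round(overlap, 3))  # 0.556
```

More than half of the words survive the round trip even though the proverb's meaning does not, which shows why word overlap alone cannot establish that semantics were preserved.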

Trustworthiness and Security

A peculiarity of [http://www.google.com/translate_t?langpair=en|es Google Translate] as of 24 January 2008 (corrected as of 25 January 2008) was the following result when translating from English to Spanish. It appeared to show a joke embedded in the English-Spanish dictionary, one with added poignancy given the actor's death earlier that week:

Heath Ledger is dead => "Tom Cruise está muerto"

This raises the issue of trustworthiness when relying on a machine translation system embedded in a life-critical system, in which the translation system provides input to a safety-critical decision-making process. It also raises the issue of whether, in a given use, the software of the machine translation system is safe from hackers.

It is not known whether this feature of Google Translate was the result of a joke/hack or perhaps an unintended consequence of the use of a method such as statistical machine translation. Reporters from CNET Networks asked Google for an explanation on January 24, 2008; Google said only that it was an "internal issue with Google Translate". [[http://www.news.com/8301-13577_3-9857280-36.html?tag=newsmap "Google Translate bug mixes up Heath Ledger, Tom Cruise", by Caroline McCarthy, CNET Networks, January 24, 2008]] The mistranslation was the subject of much hilarity and speculation on the Internet. [[http://gawker.com/5002510/tom-cruise-is-spanish-for-heath-ledger '"Tom Cruise" is Spanish for "Heath Ledger"', gawker.com, January 24, 2008]] [[http://rayhey2.blogspot.com/2008/01/tom-cruise-est-muerto.html "Tom Cruise está muerto", Ray Leon Blog Project, January 24, 2008]]

If it is an unintended consequence of the use of a method such as statistical machine translation, and not a joke/hack, then this event is a demonstration of a potential source of critical unreliability in the statistical machine translation method.

In human translation, and in interpreting in particular, the interpreter's selectivity in rendering what was said is often commented on when one of the two parties being served knows both languages.

This leads to the issue of whether a particular translation could be considered "verifiable". In this case, a converging round-trip translation would be a kind of verification.

References

* Gimenez, Jesus and Enrique Amigo. (2005) [http://www.lsi.upc.edu/~nlp/IQMT/IQMT.v1.0.pdf IQmt: A framework for machine translation evaluation].

* NIST. [http://www.nist.gov/speech/tests/mt Annual machine translation system evaluations] and [http://www.nist.gov/speech/tests/mt/doc/mt06_evalplan.v4.pdf evaluation plan].

* Papineni, Kishore, Salim Roukos, Todd Ward and Wei-Jing Zhu. (2002) BLEU: A method for automatic evaluation of machine translation. Proc. 40th Annual Meeting of the ACL, July 2002, pp. 311-318.

See also

* Evaluation of machine translation
* Round-trip translation
* Translation


Wikimedia Foundation. 2010.
