Corpora in Translation Studies


The translator’s workplace has changed gradually over the last ten years. Today the computer can be considered an important, even essential, tool in translation. However, the computer does not replace traditional tools such as monolingual and bilingual dictionaries, terminologies and encyclopaedias, whether on paper or in digital format. Personal computers can process information more easily and quickly than ever before. The problem is no longer finding a piece of information, but finding the right and reliable information.

Here corpora and concordancing software play an important role, since they give translators access to information about language, content and translation practices that was hardly available before the present stage of ICT development.


Corpus-Based Machine Translation

Corpus-based machine translation relies on the analysis of real text samples together with their translations. Among the approaches that use corpora are statistical methods and example-based methods.

Statistical

The main objective of statistical machine translation is to generate translations using statistical methods based on corpora of bilingual texts. For instance, the proceedings of the European Parliament are published in all the official languages of the European Union (EU). If more corpora of this kind were available, excellent results could be obtained when translating texts on those subjects. The first statistical machine translation program was CANDIDE, developed by IBM.

Example-based

Example-based machine translation is characterized by its use of a bilingual corpus as its main source of knowledge. It is essentially translation by analogy and can be seen as an application of case-based reasoning, as used in machine learning, which consists in solving a problem on the basis of the solutions to other, similar problems.
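As a rough illustration of how translation knowledge can be estimated from bilingual texts, the sketch below trains a toy word-translation model in the spirit of IBM Model 1 using expectation-maximization. It is not a description of CANDIDE itself: the function names `train_ibm_model1` and `best_translation`, the number of iterations and the four-sentence English-Spanish corpus are all assumptions made for the example.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate word translation probabilities t(target | source) with EM."""
    tokenized = [(src.split(), tgt.split()) for src, tgt in pairs]
    target_vocab = {word for _, tgt in tokenized for word in tgt}
    # Start from a uniform distribution over the target vocabulary.
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)
    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) link counts
        total = defaultdict(float)  # expected counts per source word
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in tokenized:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    fraction = t[(f, e)] / norm
                    count[(f, e)] += fraction
                    total[e] += fraction
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get)


# Invented toy corpus (English source, Spanish target), for illustration only.
TOY_CORPUS = [
    ("the house", "la casa"),
    ("the book", "el libro"),
    ("a house", "una casa"),
    ("a book", "un libro"),
]

model = train_ibm_model1(TOY_CORPUS)
print(best_translation(model, "house"))
print(best_translation(model, "book"))
```

A real system would be trained on millions of aligned sentences and would also model word order and fluency; the sketch only shows why larger bilingual corpora give more reliable estimates.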
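The retrieve-and-adapt idea behind case-based reasoning can be sketched as follows. This is a minimal illustration under stated assumptions, not an actual example-based system: the stored examples, the small bilingual lexicon and the function name `translate_by_example` are hypothetical, and adaptation is limited to swapping a single word.

```python
from difflib import SequenceMatcher

# Invented example base and lexicon (English to Spanish), for illustration only.
EXAMPLES = [
    ("I like red wine", "Me gusta el vino tinto"),
    ("She reads a book", "Ella lee un libro"),
]
LEXICON = {"book": "libro", "diary": "diario"}


def translate_by_example(sentence, examples=EXAMPLES, lexicon=LEXICON):
    """Reuse the translation of the most similar stored example."""
    words = sentence.split()

    def similarity(example):
        return SequenceMatcher(None, words, example[0].split()).ratio()

    # Retrieve: pick the stored case whose source side is closest to the input.
    source, target = max(examples, key=similarity)
    source_words = source.split()
    # Adapt: where exactly one word differs, swap in its lexicon translation.
    matcher = SequenceMatcher(None, source_words, words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
            old, new = source_words[i1], words[j1]
            if old in lexicon and new in lexicon:
                target = target.replace(lexicon[old], lexicon[new])
    return target


print(translate_by_example("She reads a diary"))
```

The sketch deliberately ignores agreement and word order, which is where real example-based systems need far more linguistic knowledge.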

Corpora and Translation

Corpus typology

According to EAGLES, a general distinction can be made between monolingual and multilingual corpora. Within multilingual corpora, we can further distinguish between:

• Comparable corpora: corpora compiled using similar design criteria but which are not translations.

• Parallel or translation corpora: texts in one language aligned with their translation in another. Several variables have to be taken into account, such as the directness of the translation and the number of languages.

There are also monolingual comparable corpora: corpora composed of two sub-sections, one of original texts in a language and the other of texts translated into that same language. They are useful for translation theorists and researchers, whereas professional technical translators use translation memories.

Defining Translation memories

A translation memory (TM) is a very specific type of parallel corpus in that:

a) It is “proprietary”: TMs are created individually or collectively around specific translation projects.

b) It tends towards closure, standardization and a restricted range of linguistic options.

Translation workbenches and TMs can be considered the most successful translation tools; however, their use is restricted to specific text types.
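To make the idea of a translation memory concrete, the sketch below stores source and target segments as pairs and returns the best "fuzzy match" for a new segment. The 0.75 threshold, the sample segments and the function name `tm_lookup` are assumptions made for illustration; commercial workbenches use their own matching algorithms and thresholds.

```python
from difflib import SequenceMatcher

# Invented memory of (source segment, target segment) pairs.
MEMORY = [
    ("Press the power button.", "Pulse el botón de encendido."),
    ("Remove the battery cover.", "Retire la tapa de la batería."),
]


def tm_lookup(segment, memory=MEMORY, threshold=0.75):
    """Return (score, source, target) for the best match, or None if too weak."""
    best = None
    for source, target in memory:
        # Character-level similarity between the new and the stored segment.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


print(tm_lookup("Press the reset button."))     # fuzzy match to the first entry
print(tm_lookup("The weather is nice today."))  # no stored segment is close enough
```

Repetitive, standardized texts produce many high-scoring matches, which is consistent with TMs working best on specific text types.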

Corpora aids in Translation

The kinds of corpora described above can be combined with other tools, such as dictionaries. Corpora can function as general or specialized dictionaries. In this sense, comparable corpora can be seen as monolingual dictionaries and parallel corpora can be compared to bilingual dictionaries.
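Looking a word up in a corpus usually means producing a concordance, that is, a list of every occurrence of the word with its immediate context (often called keyword in context, or KWIC). The following minimal sketch shows the principle; the function name `kwic`, the context width and the sample text are assumptions for the example and do not reproduce any particular concordancing program.

```python
import re


def kwic(text, keyword, width=30):
    """Return one concordance line per occurrence of keyword in text."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both contexts so that the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines


SAMPLE = (
    "A corpus is a collection of texts. "
    "A parallel corpus pairs each text with its translation."
)

for line in kwic(SAMPLE, "corpus"):
    print(line)
```

Where a dictionary gives a definition, a concordance shows how a word is actually used, which is why the two resources complement each other.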

Corpus resources for Translators

Not all dictionaries are the same, and neither are all corpora. Apart from translation memories, corpus resources of potential use to professional translators can be classified along a range from “robust” to “virtual”.

Examples of corpora include the BNC (British National Corpus), the Spanish corpus CREA and the Italian corpus CORIS.

Corpus linguistics distinguishes between corpora and archives of electronic texts; the latter are merely repositories of electronic texts. Building a corpus of web pages involves an information retrieval operation, in order to locate relevant and reliable documents.

In many translation classes, students have built their own “DIY” (do-it-yourself) corpora. The main benefits of DIY corpora may be summarized as follows:

• They are easy to make.

• They are a great resource for content information.

• They are a great resource for terminology and phraseology.

Their main drawbacks are:

• Not all topics, text types and languages are available.

• The relevance and reliability of documents need to be carefully assessed.

• Existing concordancing software is not well equipped to handle HTML or XML files.
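Because concordancers may not cope well with HTML files, one common workaround when building a DIY corpus from web pages is to convert each page to plain text first. The sketch below does this with Python's standard `html.parser` module; the class name `TextExtractor`, the function name `html_to_text` and the sample page are assumptions for illustration, and real pages usually need further cleaning (menus, adverts, encoding problems).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping scripts and styles."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside <script> and <style>.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text content of html as a single space-separated string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


PAGE = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><p>Parallel <b>corpora</b> help.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(html_to_text(PAGE))
```

The resulting plain-text files can then be loaded into a concordancer like any other text.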

Finally, the advantages of “robust” corpora over “virtual” corpora are as follows:

• They are usually more reliable.

• They are usually larger.

• They may be improved with linguistic and contextual information.


Wikimedia Foundation. 2010.
