- Eurotra
Eurotra was an ambitious machine translation project established and funded by the European Commission from the late 1970s until 1994.
Emboldened by modest success with an older, commercially developed machine translation system, SYSTRAN, a large network of European computational linguists embarked upon the Eurotra project with the hope of creating a state-of-the-art MT system for the then seven, later nine, official languages of the European Community. However, as time passed, expectations became tempered; "Fully Automatic High Quality Translation" was not a reasonably attainable goal. The true character of Eurotra was eventually acknowledged to be pre-competitive research rather than prototype development.
The project was motivated by one of the founding principles of the EU: that all citizens had the right to read any and all proceedings of the Commission in their own language. As more countries joined, this produced a combinatorial explosion in the number of language pairs involved, and the need to translate every paper, speech and even set of meeting minutes produced by the EU into the other eight languages meant that translation rapidly became the overwhelming component in the administrative budget. Eurotra was devised to solve this problem.
The project was unusual in that rather than consisting of a single research team, it had member groups of six to twelve distributed around the member countries, with at least one in each (Belgium, Greece and the United Kingdom each had two), and there was an additional secretariat based at the European Commission in Brussels. While this contributed significantly to the culture of the project, it also demonstrated graphically Brooks' assertion in The Mythical Man-Month that adding personnel to a project results in it taking longer to complete; the more groups are involved, the more time is spent on administration and communication rather than on actual research.
The actual design of the project was unusual as MT projects go. Older systems, such as SYSTRAN, were heavily dictionary-based, with minor support for rearranging word order. More recent systems have often taken a probabilistic approach based on their source corpora. Eurotra addressed the constituent structure of the text to be translated, going first through a syntactic parse, then through a second parse to produce a dependency structure, and then through a final parse with a third grammar to produce what was referred to internally as the Intermediate Representation (IR). Since all three modules were implemented as Prolog programs, it would then in principle be possible to put this structure backwards through the corresponding modules for another language to produce a translated text in any of the other languages. However, it is unknown whether this was in fact possible.
The first "live" translation occupied a 4Mb MicroVAX running Ultrix and C-Prolog for a complete weekend some time in early 1987. The sentence, translated from English into Danish, was "Japan makes computers". The main problem faced by the system was the generation of so-called "parse forests": often a large number of different grammar rules could be applied to any particular phrase, producing hundreds, even thousands, of (often identical) parse trees. This used up huge quantities of computer store, slowing the whole process down unnecessarily.
While Eurotra never delivered a "working" MT system, the project made a far-reaching long-term impact on the nascent language industries in European member states, in particular among the southern countries of Greece, Italy, Spain, and Portugal. There is at least one commercial MT system (developed by an academic/commercial consortium in Denmark) derived from Eurotra technology.
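The combinatorial explosion in language pairs mentioned above can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical and not taken from the Eurotra project. It counts the directed source-to-target pairs for a given number of official languages and compares that with the number of per-language module chains that would be needed if, as described above in principle, each language's modules could be run in both directions through a shared Intermediate Representation.

```python
def directed_pairs(n_languages: int) -> int:
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)


def reversible_chains(n_languages: int) -> int:
    """Module chains needed if one reversible chain per language suffices.

    Assumption for illustration: each language has a single chain of
    modules that can be used both for analysis and for generation.
    """
    return n_languages


if __name__ == "__main__":
    # Seven, later nine, official languages, as stated in the article.
    for n in (7, 9):
        print(n, directed_pairs(n), reversible_chains(n))
    # 7 languages -> 42 directed pairs; 9 languages -> 72 directed pairs
```

With nine languages, each document has eight target languages and there are 72 directed pairs overall, whereas the number of per-language chains grows only linearly under the stated assumption.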
Wikimedia Foundation. 2010.