Traduction et Langues
Volume 21, Numéro 1, Pages 77-98
2022-08-31

Post-édition de TA neuronale à la DGT et qualité des textes finaux : étude de cas / Neural Machine Translation Post-editing in DGT and Final Text Quality: A Case Study

Author: Loïc de Faria Pires.

Abstract

This article presents the results of a case study carried out in collaboration with the European Commission’s Directorate-General for Translation (DGT). The study analyses the quality of content post-edited from Neural Machine Translation (NMT) proposals (eTranslation NMT engine) by translators with varying levels of translation experience. Two types of participants were recruited: “Blue Book” interns (i.e. recently graduated translators taking part in a five-month paid internship at DGT) and in-house translators. For this analysis, we used an evaluation grid created by the French researchers Toudic et al. (2014), which contains nine error categories as well as four types of effects that guide raters when they assign severity penalties to errors. The reliability of this tool was verified through an interrater agreement score: 583 revision marks were compared by two investigators in terms of 1) severity penalty, 2) category and 3) raw MT responsibility. As far as methodology is concerned, for each source text, an NMT proposal from the eTranslation engine was post-edited by a DGT translator (10 participants: 7 in-house translators and 3 “Blue Book” interns) and revised by a DGT colleague. This procedure follows the typical DGT workflow: texts are usually first translated by a translator and then systematically revised by a colleague from the same (or, occasionally, a different) translation unit. The quality of the post-edited (PE) texts was thus evaluated through the revision marks introduced into them. Each revision mark was categorised and assigned a penalty score ranging from 1 (minor) to 5 (critical), according to the perceived distortion of the original message and of the intention that the source text is supposed to convey.
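The interrater check described above can be illustrated with a simple percent-agreement computation. This is a hedged sketch only: the abstract does not specify which agreement metric was used, and the function name and sample ratings below are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Share of items on which two raters gave the same label, as a percentage."""
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a) * 100

# Hypothetical severity penalties (1 = minor ... 5 = critical) assigned by
# two investigators to the same six revision marks:
investigator_1 = [1, 3, 5, 2, 2, 4]
investigator_2 = [1, 3, 4, 2, 2, 4]
print(round(percent_agreement(investigator_1, investigator_2), 1))  # 83.3
```

In the study itself, agreement was checked along three dimensions (severity penalty, category, raw MT responsibility), so such a computation would be run once per dimension over the 583 compared revision marks.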
Severity penalties were then normalised on a 100-word basis so that results could be compared across participants and texts: a total penalty score was computed for each text and then divided accordingly to obtain a per-100-word penalty score. These normalised scores enabled us to compare the perceived quality of the texts provided by our participants. Although our results cannot be generalised, since this is a case study for which no significance score could be computed (not enough data), several conclusions were reached: overall PE text quality is higher among participants with high experience levels (senior translators) than among junior translators; participants with lower experience levels produce PE texts containing more fidelity and terminology problems than their more experienced counterparts; and professional experience does not seem to influence the proportion of errors directly caused by NMT proposals. Several organisational constraints limited the scope of our study. First, the modest number of participants did not allow for statistically significant results. A larger study could therefore be carried out with more volunteers in order to reach more generalisable conclusions. Secondly, each participant provided us with an uneven number of texts and post-edited words. This is due to the very nature of our study, in which translators supplied texts drawn from their daily translation tasks, which limits the quantity of data collected but increases ecological validity. Furthermore, the authentic context in which this study was implemented did not enable us to collect process data: further studies could include such data, which would yield more representative results and offer insight into translators’ cognitive processes when post-editing.
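The normalisation step described above can be sketched as follows. This is a minimal illustration; the function name and the sample figures are hypothetical, not taken from the study.

```python
def normalise_penalty(total_penalty: float, word_count: int) -> float:
    """Scale a text's total severity penalty to a 100-word basis."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return total_penalty / word_count * 100

# Example: a post-edited text of 450 words whose revision marks carry
# severity penalties (1 = minor ... 5 = critical) summing to 18.
penalties = [1, 2, 5, 3, 1, 2, 4]   # hypothetical severity scores
total = sum(penalties)              # 18
print(normalise_penalty(total, 450))  # 4.0
```

Because the score is expressed per 100 words, a 450-word text and a 2,000-word text can be compared directly, which is what makes the cross-participant comparison in the study possible.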
In this context, eye-tracking data could be collected, and methods such as questionnaires and think-aloud protocols could be implemented to link process data to the quality scores obtained in our study. Finally, studying additional language pairs would be relevant, since NMT quality tends to vary across them.

Keywords

Institutional Translation; Neural Machine Translation; Post-Editing; Product; Quality