An in-depth analysis of the individual impact of controlled language rules on machine translation output : a mixed-methods approach

Marzouk, Shaimaa

doi:http://doi.org/10.25358/openscience-7229

Files

an_indepth_analysis_of_the_in-20220627094043942.pdf (2.82 MB)

Date issued

2021

Authors

Marzouk, Shaimaa

Reuse License

Description of rights: CC-BY-4.0

Item

Journal article

Open Access

Abstract

Examining the general impact of Controlled Language (CL) rules in the context of Machine Translation (MT) has been an area of research for many years. The present study focuses on the following question: how do CL rules impact MT output individually? By analysing a German corpus-based test suite of technical texts that have been translated into English by different MT systems, this study endeavours to answer this question at different levels: the general impact of CL rules (rule- and system-independent), their impact at rule level (system-independent) as well as at rule and system level. The results of five MT systems are analysed and contrasted: a rule-based system, a statistical system, two differently constructed hybrid systems, and a neural system. For this, a mixed-methods triangulation approach that includes error annotation, human evaluation, and automatic evaluation was applied. The data was analysed both qualitatively and quantitatively in terms of CL influence on the following parameters: number and type of MT errors, style and content quality, and scores of two automatic evaluation metrics. In line with many studies, the results show a general positive impact of the applied CL rules on the MT output. However, at rule level, only four rules proved to have positive effects on the aforementioned parameters; three rules had negative effects on the parameters; and two rules did not show any significant impact. At rule and system level, the rules affected the MT systems differently, as expected. Rules that had a positive impact on earlier MT approaches did not show the same impact on the neural MT approach. Furthermore, neural MT delivered distinctly better results than earlier MT approaches, namely the highest error-free, style and content quality rates both before and after applying the rules, which indicates that neural MT offers a promising solution that no longer requires CL rules for improving the MT output.

DOI

http://doi.org/10.25358/openscience-7229

URI

https://openscience.ub.uni-mainz.de/handle/20.500.12030/7243

Published in

Machine Translation, 35, Springer Science + Business Media B.V., Dordrecht et al., 2021, https://doi.org/10.1007/s10590-021-09266-0

Collections

JGU-Publikationen
