Please use this identifier to cite or link to this item: http://doi.org/10.25358/openscience-7229
Authors: Marzouk, Shaimaa
Title: An in-depth analysis of the individual impact of controlled language rules on machine translation output : a mixed-methods approach
Online publication date: 27-Jun-2022
Year of first publication: 2021
Language: English
Abstract: Examining the general impact of Controlled Language (CL) rules in the context of Machine Translation (MT) has been an area of research for many years. The present study focuses on the following question: how do CL rules impact MT output individually? By analysing a German corpus-based test suite of technical texts that have been translated into English by different MT systems, this study endeavours to answer this question at different levels: the general impact of CL rules (rule- and system-independent), their impact at rule level (system-independent) as well as at rule and system level. The results of five MT systems are analysed and contrasted: a rule-based system, a statistical system, two differently constructed hybrid systems, and a neural system. For this, a mixed-methods triangulation approach that includes error annotation, human evaluation, and automatic evaluation was applied. The data was analysed both qualitatively and quantitatively in terms of CL influence on the following parameters: number and type of MT errors, style and content quality, and scores of two automatic evaluation metrics. In line with many studies, the results show a general positive impact of the applied CL rules on the MT output. However, at rule level, only four rules proved to have positive effects on the aforementioned parameters; three rules had negative effects on the parameters; and two rules did not show any significant impact. At rule and system level, the rules affected the MT systems differently, as expected. Rules that had a positive impact on earlier MT approaches did not show the same impact on the neural MT approach. Furthermore, neural MT delivered distinctly better results than earlier MT approaches, namely the highest error-free, style and content quality rates both before and after applying the rules, which indicates that neural MT offers a promising solution that no longer requires CL rules for improving the MT output.
DDC: 400 Language
Institution: Johannes Gutenberg-Universität Mainz
Department: FB 06 Translations-, Sprach- und Kulturwissenschaft
Place: Mainz
ROR: https://ror.org/023b0x485
DOI: http://doi.org/10.25358/openscience-7229
Version: Published version
Publication type: Journal article
License: CC BY
Information on rights of use: https://creativecommons.org/licenses/by/4.0/
Journal: Machine Translation
Volume: 35
Pages or article number: 167-203
Publisher: Springer Science + Business Media B.V.
Publisher place: Dordrecht et al.
Issue date: 2021
ISSN: 1573-0573
Publisher DOI: 10.1007/s10590-021-09266-0
Appears in collections: JGU-Publikationen

Files in This Item:
File: an_indepth_analysis_of_the_in-20220627094043942.pdf
Size: 2.88 MB
Format: Adobe PDF