Related papers: Using descriptive mark-up to formalize translation…

Can Automatic Metrics Assess High-Quality Translations?

Automatic metrics for evaluating translation quality are typically validated by measuring how well they correlate with human assessments. However, correlation methods tend to capture only the ability of metrics to differentiate between good…

Computation and Language · Computer Science 2024-10-11 Sweta Agrawal, António Farinhas, Ricardo Rei, André F. T. Martins

Quality Estimation of Machine Translated Texts based on Direct Evidence from Training Data

Current Machine Translation systems achieve very good results on a growing variety of language pairs and data sets. However, it is now well known that they produce fluent translation outputs that often can contain important meaning errors.…

Computation and Language · Computer Science 2023-06-28 Vibhuti Kumari, Narayana Murthy Kavi

Quantifying the Impact of Translation Errors on Multilingual LLM Evaluation

Machine-translated benchmarks are widely used to assess the multilingual capabilities of large language models (LLMs), yet translation errors in these benchmarks remain underexplored, raising concerns about the reliability and comparability…

Computation and Language · Computer Science 2026-05-26 Klaudia-Doris Thellmann, Bernhard Stadler, Michael Färber, Jens Lehmann

Translation Quality Assessment: A Brief Survey on Manual and Automatic Methods

To facilitate effective translation modeling and translation studies, one of the crucial questions to address is how to assess translation quality. From the perspectives of accuracy, reliability, repeatability and cost, translation quality…

Computation and Language · Computer Science 2021-05-10 Lifeng Han, Gareth J. F. Jones, Alan F. Smeaton

Using Machine Translation to Augment Multilingual Classification

An all-too-present bottleneck for text classification model development is the need to annotate training data and this need is multiplied for multilingual classifiers. Fortunately, contemporary machine translation models are both easily…

Computation and Language · Computer Science 2024-05-10 Adam King

Evaluating Language Translation Models by Playing Telephone

Our ability to efficiently and accurately evaluate the quality of machine translation systems has been outrun by the effectiveness of current language models--which limits the potential for further improving these models on more challenging…

Computation and Language · Computer Science 2025-09-25 Syeda Jannatus Saba, Steven Skiena

Span-Level Machine Translation Meta-Evaluation

Machine Translation (MT) and automatic MT evaluation have improved dramatically in recent years, enabling numerous novel applications. Automatic evaluation techniques have evolved from producing scalar quality scores to precisely locating…

Computation and Language · Computer Science 2026-03-23 Stefano Perrella, Eric Morales Agostinho, Hugo Zaragoza

What do the metrics mean? A critical analysis of the use of Automated Evaluation Metrics in Interpreting

With the growth of interpreting technologies, from remote interpreting and Computer-Aided Interpreting to automated speech translation and interpreting avatars, there is now a high demand for ways to quickly and efficiently measure the…

Computation and Language · Computer Science 2026-01-12 Jonathan Downie, Joss Moorkens

Iterative Translation Refinement with Large Language Models

We propose iteratively prompting a large language model to self-correct a translation, with inspiration from their strong language understanding and translation capability as well as a human-like translation approach. Interestingly,…

Computation and Language · Computer Science 2024-05-03 Pinzhen Chen, Zhicheng Guo, Barry Haddow, Kenneth Heafield

Evaluating Optimal Reference Translations

The overall translation quality reached by current machine translation (MT) systems for high-resourced language pairs is remarkably good. Standard methods of evaluation are not suitable nor intended to uncover the many translation errors…

Computation and Language · Computer Science 2024-03-11 Vilém Zouhar, Věra Kloudová, Martin Popel, Ondřej Bojar

Objective Metrics for Evaluating Large Language Models Using External Data Sources

Evaluating the performance of Large Language Models (LLMs) is a critical yet challenging task, particularly when aiming to avoid subjective assessments. This paper proposes a framework for leveraging subjective metrics derived from the…

Computation and Language · Computer Science 2025-08-13 Haoze Du, Richard Li, Edward Gehringer

Quality Estimation without Human-labeled Data

Quality estimation aims to measure the quality of translated content without access to a reference translation. This is crucial for machine translation systems in real-world scenarios where high-quality translation is needed. While many…

Computation and Language · Computer Science 2021-02-09 Yi-Lin Tuan, Ahmed El-Kishky, Adithya Renduchintala, Vishrav Chaudhary, Francisco Guzmán, Lucia Specia

MMTE: Corpus and Metrics for Evaluating Machine Translation Quality of Metaphorical Language

Machine Translation (MT) has developed rapidly since the release of Large Language Models and current MT evaluation is performed through comparison with reference human translations or by predicting quality scores from human-labeled data.…

Computation and Language · Computer Science 2024-11-11 Shun Wang, Ge Zhang, Han Wu, Tyler Loakman, Wenhao Huang, Chenghua Lin

Metric for Automatic Machine Translation Evaluation based on Universal Sentence Representations

Sentence representations can capture a wide range of information that cannot be captured by local features based on character or word N-grams. This paper examines the usefulness of universal sentence representations for evaluating the…

Computation and Language · Computer Science 2018-05-22 Hiroki Shimanaka, Tomoyuki Kajiwara, Mamoru Komachi

Automatic Evaluation Metrics for Document-level Translation: Overview, Challenges and Trends

With the rapid development of deep learning technologies, the field of machine translation has witnessed significant progress, especially with the advent of large language models (LLMs) that have greatly propelled the advancement of…

Computation and Language · Computer Science 2025-04-22 Jiaxin GUO, Xiaoyu Chen, Zhiqiang Rao, Jinlong Yang, Zongyao Li, Hengchao Shang, Daimeng Wei, Hao Yang

A Comparison of Approaches to Document-level Machine Translation

Document-level machine translation conditions on surrounding sentences to produce coherent translations. There has been much recent work in this area with the introduction of custom model architectures and decoding algorithms. This paper…

Computation and Language · Computer Science 2021-01-28 Zhiyi Ma, Sergey Edunov, Michael Auli

Exploring Prediction Uncertainty in Machine Translation Quality Estimation

Machine Translation Quality Estimation is a notoriously difficult task, which lessens its usefulness in real-world translation environments. Such scenarios can be improved if quality predictions are accompanied by a measure of uncertainty.…

Computation and Language · Computer Science 2016-07-01 Daniel Beck, Lucia Specia, Trevor Cohn

Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering

Building a reliable visual question answering (VQA) system across different languages is a challenging problem, primarily due to the lack of abundant samples for training. To address this challenge, recent studies have employed machine…

Computation and Language · Computer Science 2024-06-05 ChaeHun Park, Koanho Lee, Hyesu Lim, Jaeseok Kim, Junmo Park, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo

Meaningful Pose-Based Sign Language Evaluation

We present a comprehensive study on meaningfully evaluating sign language utterances in the form of human skeletal poses. The study covers keypoint distance-based, embedding-based, and back-translation-based metrics. We show tradeoffs…

Computation and Language · Computer Science 2025-10-10 Zifan Jiang, Colin Leong, Amit Moryossef, Anne Göhring, Annette Rios, Oliver Cory, Maksym Ivashechkin, Neha Tarigopula, Biao Zhang, Rico Sennrich, Sarah Ebling

A High-Quality Multilingual Dataset for Structured Documentation Translation

This paper presents a high-quality multilingual dataset for the documentation domain to advance research on localization of structured text. Unlike widely-used datasets for translation of plain text, we collect XML-structured parallel text…

Computation and Language · Computer Science 2020-06-25 Kazuma Hashimoto, Raffaella Buschiazzo, James Bradbury, Teresa Marshall, Richard Socher, Caiming Xiong