Related papers: KoBE: Knowledge-Based Machine Translation Evaluati…

Poor Man's Quality Estimation: Predicting Reference-Based MT Metrics Without the Reference

Machine translation quality estimation (QE) predicts human judgements of a translation hypothesis without seeing the reference. State-of-the-art QE systems based on pretrained language models have been achieving remarkable correlations with…

Computation and Language · Computer Science 2023-04-26 Vilém Zouhar, Shehzaad Dhuliawala, Wangchunshu Zhou, Nico Daheim, Tom Kocmi, Yuchen Eleanor Jiang, Mrinmaya Sachan

BLEU might be Guilty but References are not Innocent

The quality of automatic metrics for machine translation has been increasingly called into question, especially for high-quality systems. This paper demonstrates that, while choice of metric is important, the nature of the references is…

Computation and Language · Computer Science 2020-10-21 Markus Freitag, David Grangier, Isaac Caswell

MT-Ranker: Reference-free machine translation evaluation by inter-system ranking

Traditionally, Machine Translation (MT) Evaluation has been treated as a regression problem -- producing an absolute translation-quality score. This approach has two limitations: i) the scores lack interpretability, and human annotators…

Computation and Language · Computer Science 2024-01-31 Ibraheem Muhammad Moosa, Rui Zhang, Wenpeng Yin

"Bilingual Expert" Can Find Translation Errors

Recent advances in statistical machine translation via the adoption of neural sequence-to-sequence models empower the end-to-end system to achieve state-of-the-art in many WMT benchmarks. The performance of such machine translation (MT)…

Computation and Language · Computer Science 2018-11-20 Kai Fan, Jiayi Wang, Bo Li, Fengming Zhou, Boxing Chen, Luo Si

Assessing Reference-Free Peer Evaluation for Machine Translation

Reference-free evaluation has the potential to make machine translation evaluation substantially more scalable, allowing us to pivot easily to new languages or domains. It has been recently shown that the probabilities given by a large,…

Computation and Language · Computer Science 2021-04-13 Sweta Agrawal, George Foster, Markus Freitag, Colin Cherry

Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation

Neural machine translation systems estimate probabilities of target sentences given source sentences, yet these estimates may not align with human preferences. This work introduces QE-fusion, a method that synthesizes translations using a…

Computation and Language · Computer Science 2024-06-07 Giorgos Vernikos, Andrei Popescu-Belis

Pre-training and Diagnosing Knowledge Base Completion Models

In this work, we introduce and analyze an approach to knowledge transfer from one collection of facts to another without the need for entity or relation matching. The method works for both canonicalized knowledge bases and uncanonicalized…

Computation and Language · Computer Science 2024-02-20 Vid Kocijan, Myeongjun Erik Jang, Thomas Lukasiewicz

Online Learning Meets Machine Translation Evaluation: Finding the Best Systems with the Least Human Effort

In Machine Translation, assessing the quality of a large amount of automatic translations can be challenging. Automatic metrics are not reliable when it comes to high performing systems. In addition, resorting to human evaluators can be…

Computation and Language · Computer Science 2021-05-31 Vânia Mendonça, Ricardo Rei, Luisa Coheur, Alberto Sardinha, Ana Lúcia Santos

Building a Functional Machine Translation Corpus for Kpelle

In this paper, we introduce the first publicly available English-Kpelle dataset for machine translation, comprising over 2000 sentence pairs drawn from everyday communication, religious texts, and educational materials. By fine-tuning…

Computation and Language · Computer Science 2025-05-27 Kweku Andoh Yamoah, Jackson Weako, Emmanuel J. Dorley

LLM-Based Evaluation of Low-Resource Machine Translation: A Reference-less Dialect Guided Approach with a Refined Sylheti-English Benchmark

Evaluating machine translation (MT) for low-resource languages poses a persistent challenge, primarily due to the limited availability of high quality reference translations. This issue is further exacerbated in languages with multiple…

Computation and Language · Computer Science 2025-05-20 Md. Atiqur Rahman, Sabrina Islam, Mushfiqul Haque Omi

Knowledge-Prompted Estimator: A Novel Approach to Explainable Machine Translation Assessment

Cross-lingual Machine Translation (MT) quality estimation plays a crucial role in evaluating translation performance. GEMBA, the first MT quality assessment metric based on Large Language Models (LLMs), employs one-step prompting to achieve…

Computation and Language · Computer Science 2023-06-14 Hao Yang, Min Zhang, Shimin Tao, Minghan Wang, Daimeng Wei, Yanfei Jiang

Phrase-Based & Neural Unsupervised Machine Translation

Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of…

Computation and Language · Computer Science 2018-08-15 Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato

Revisiting Round-Trip Translation for Quality Estimation

Quality estimation (QE) is the task of automatically evaluating the quality of translations without human-translated references. Calculating BLEU between the input sentence and round-trip translation (RTT) was once considered as a metric…

Computation and Language · Computer Science 2020-04-30 Jihyung Moon, Hyunchang Cho, Eunjeong L. Park

Embedding Multimodal Relational Data for Knowledge Base Completion

Representing entities and relations in an embedding space is a well-studied approach for machine learning on relational data. Existing approaches, however, primarily focus on simple link structure between a finite set of entities, ignoring…

Artificial Intelligence · Computer Science 2018-09-11 Pouya Pezeshkpour, Liyan Chen, Sameer Singh

Machine Translation Evaluation using Bi-directional Entailment

In this paper, we propose a new metric for Machine Translation (MT) evaluation, based on bi-directional entailment. We show that machine generated translation can be evaluated by determining paraphrasing with a reference translation…

Computation and Language · Computer Science 2019-11-05 Rakesh Khobragade, Heaven Patel, Anand Namdev, Anish Mishra, Pushpak Bhattacharyya

MELO: An Evaluation Benchmark for Multilingual Entity Linking of Occupations

We present the Multilingual Entity Linking of Occupations (MELO) Benchmark, a new collection of 48 datasets for evaluating the linking of entity mentions in 21 languages to the ESCO Occupations multilingual taxonomy. MELO was built using…

Computation and Language · Computer Science 2024-10-14 Federico Retyk, Luis Gasco, Casimiro Pio Carrino, Daniel Deniz, Rabih Zbib

MILE-RefHumEval: A Reference-Free, Multi-Independent LLM Framework for Human-Aligned Evaluation

We introduce MILE-RefHumEval, a reference-free framework for evaluating Large Language Models (LLMs) without ground-truth annotations or evaluator coordination. It leverages an ensemble of independently prompted evaluators guided by a…

Computation and Language · Computer Science 2026-02-11 Nalin Srun, Parisa Rastin, Guénaël Cabanes, Lydia Boudjeloud Assala

Quality Estimation without Human-labeled Data

Quality estimation aims to measure the quality of translated content without access to a reference translation. This is crucial for machine translation systems in real-world scenarios where high-quality translation is needed. While many…

Computation and Language · Computer Science 2021-02-09 Yi-Lin Tuan, Ahmed El-Kishky, Adithya Renduchintala, Vishrav Chaudhary, Francisco Guzmán, Lucia Specia

Context-Aware Monolingual Human Evaluation of Machine Translation

This paper explores the potential of context-aware monolingual human evaluation for assessing machine translation (MT) when no source is given for reference. To this end, we compare monolingual with bilingual evaluations (with source text),…

Computation and Language · Computer Science 2025-04-11 Silvio Picinini, Sheila Castilho

Reference-less Quality Estimation of Text Simplification Systems

The evaluation of text simplification (TS) systems remains an open challenge. As the task has common points with machine translation (MT), TS is often evaluated using MT metrics such as BLEU. However, such metrics require high quality…

Computation and Language · Computer Science 2019-01-31 Louis Martin, Samuel Humeau, Pierre-Emmanuel Mazaré, Antoine Bordes, Éric Villemonte de La Clergerie, Benoît Sagot