Related papers: Simultaneous Machine Translation with Visual Conte…

Exploiting Multimodal Reinforcement Learning for Simultaneous Machine Translation

This paper addresses the problem of simultaneous machine translation (SiMT) by exploring two main concepts: (a) adaptive policies to learn a good trade-off between high translation quality and low latency; and (b) visual information to…

Computation and Language · Computer Science 2021-02-24 Julia Ive, Andy Mingren Li, Yishu Miao, Ozan Caglayan, Pranava Madhyastha, Lucia Specia

Towards Multimodal Simultaneous Neural Machine Translation

Simultaneous translation involves translating a sentence before the speaker's utterance is completed in order to realize real-time understanding in multiple languages. This task is significantly more challenging than the general full…

Computation and Language · Computer Science 2020-10-26 Aizhan Imankulova, Masahiro Kaneko, Tosho Hirasawa, Mamoru Komachi

Probing the Need for Visual Context in Multimodal Machine Translation

Current work on multimodal machine translation (MMT) has suggested that the visual modality is either unnecessary or only marginally beneficial. We posit that this is a consequence of the very simple, short and repetitive sentences used in…

Computation and Language · Computer Science 2019-06-04 Ozan Caglayan, Pranava Madhyastha, Lucia Specia, Loïc Barrault

Vision Matters When It Should: Sanity Checking Multimodal Machine Translation Models

Multimodal machine translation (MMT) systems have been shown to outperform their text-only neural machine translation (NMT) counterparts when visual context is available. However, recent studies have also shown that the performance of MMT…

Computation and Language · Computer Science 2021-09-09 Jiaoda Li, Duygu Ataman, Rico Sennrich

Supervised Visual Attention for Simultaneous Multimodal Machine Translation

Recently, there has been a surge in research in multimodal machine translation (MMT), where additional modalities such as images are used to improve translation quality of textual systems. A particular use for such multimodal systems is the…

Computation and Language · Computer Science 2022-07-07 Veneta Haralampieva, Ozan Caglayan, Lucia Specia

CaMMT: Benchmarking Culturally Aware Multimodal Machine Translation

Translating cultural content poses challenges for machine translation systems due to the differences in conceptualizations between cultures, where language alone may fail to convey sufficient context to capture region-specific meanings. In…

Computation and Language · Computer Science 2025-09-23 Emilio Villa-Cueva, Sholpan Bolatzhanova, Diana Turmakhan, Kareem Elzeky, Henok Biadglign Ademtew, Alham Fikri Aji, Vladimir Araujo, Israel Abebe Azime, Jinheon Baek, Frederico Belcavello, Fermin Cristobal, Jan Christian Blaise Cruz, Mary Dabre, Raj Dabre, Toqeer Ehsan, Naome A Etori, Fauzan Farooqui, Jiahui Geng, Guido Ivetta, Thanmay Jayakumar, Soyeong Jeong, Zheng Wei Lim, Aishik Mandal, Sofia Martinelli, Mihail Minkov Mihaylov, Daniil Orel, Aniket Pramanick, Sukannya Purkayastha, Israfel Salazar, Haiyue Song, Tiago Timponi Torrent, Debela Desalegn Yadeta, Injy Hamed, Atnafu Lambebo Tonja, Thamar Solorio

Impact of Visual Context on Noisy Multimodal NMT: An Empirical Study for English to Indian Languages

Neural Machine Translation (NMT) has made remarkable progress using large-scale textual data, but the potential of incorporating multimodal inputs, especially visual information, remains underexplored in high-resource settings. While prior…

Computation and Language · Computer Science 2025-10-31 Baban Gain, Dibyanayan Bandyopadhyay, Samrat Mukherjee, Chandranath Adak, Asif Ekbal

Exploring the Necessity of Visual Modality in Multimodal Machine Translation using Authentic Datasets

Recent research in the field of multimodal machine translation (MMT) has indicated that the visual modality is either dispensable or offers only marginal advantages. However, most of these conclusions are drawn from the analysis of…

Computation and Language · Computer Science 2024-04-10 Zi Long, Zhenhao Tang, Xianghua Fu, Jian Chen, Shilong Hou, Jinze Lyu

Context Consistency between Training and Testing in Simultaneous Machine Translation

Simultaneous Machine Translation (SiMT) aims to yield a real-time partial translation with a monotonically growing source-side context. However, there is a counterintuitive phenomenon about the context usage between training and…

Computation and Language · Computer Science 2023-11-14 Meizhi Zhong, Lemao Liu, Kehai Chen, Mingming Yang, Min Zhang

Increasing Visual Awareness in Multimodal Neural Machine Translation from an Information Theoretic Perspective

Multimodal machine translation (MMT) aims to improve translation quality by equipping the source sentence with its corresponding image. Despite the promising performance, MMT models still suffer the problem of input degradation: models…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Baijun Ji, Tong Zhang, Yicheng Zou, Bojie Hu, Si Shen

Reducing Position Bias in Simultaneous Machine Translation with Length-Aware Framework

Simultaneous machine translation (SiMT) starts translating while receiving the streaming source inputs, and hence the source sentence is always incomplete during translating. Different from the full-sentence MT using the conventional…

Computation and Language · Computer Science 2022-03-24 Shaolei Zhang, Yang Feng

Unsupervised Multi-modal Neural Machine Translation

Unsupervised neural machine translation (UNMT) has recently achieved remarkable results with only large monolingual corpora in each language. However, the uncertainty of associating target with source sentences makes UNMT theoretically an…

Computer Vision and Pattern Recognition · Computer Science 2019-05-28 Yuanhang Su, Kai Fan, Nguyen Bach, C.-C. Jay Kuo, Fei Huang

Neural Machine Translation with Phrase-Level Universal Visual Representations

Multimodal machine translation (MMT) aims to improve neural machine translation (NMT) with additional visual information, but most existing MMT methods require paired input of source sentence and image, which makes them suffer from shortage…

Computation and Language · Computer Science 2022-03-22 Qingkai Fang, Yang Feng

Good for Misconceived Reasons: An Empirical Revisiting on the Need for Visual Context in Multimodal Machine Translation

A neural multimodal machine translation (MMT) system is one that aims to perform better translation by extending conventional text-only translation models with multimodal information. Many recent studies report improvements when equipping…

Computation and Language · Computer Science 2021-06-01 Zhiyong Wu, Lingpeng Kong, Wei Bi, Xiang Li, Ben Kao

Learning to Translate in Real-time with Neural Machine Translation

Translating in real-time, a.k.a. simultaneous translation, outputs translation words before the input sentence ends, which is a challenging problem for conventional machine translation methods. We propose a neural machine translation (NMT)…

Computation and Language · Computer Science 2017-01-12 Jiatao Gu, Graham Neubig, Kyunghyun Cho, Victor O. K. Li

From Simultaneous to Streaming Machine Translation by Leveraging Streaming History

Simultaneous Machine Translation is the task of incrementally translating an input sentence before it is fully available. Currently, simultaneous translation is carried out by translating each sentence independently of the previously…

Computation and Language · Computer Science 2022-04-01 Javier Iranzo-Sánchez, Jorge Civera, Alfons Juan

LVP-M3: Language-aware Visual Prompt for Multilingual Multimodal Machine Translation

Multimodal Machine Translation (MMT) focuses on enhancing text-only translation with visual features, which has attracted considerable attention from both natural language processing and computer vision communities. Recent advances still…

Computation and Language · Computer Science 2022-11-29 Hongcheng Guo, Jiaheng Liu, Haoyang Huang, Jian Yang, Zhoujun Li, Dongdong Zhang, Zheng Cui, Furu Wei

Beyond Triplet: Leveraging the Most Data for Multimodal Machine Translation

Multimodal machine translation (MMT) aims to improve translation quality by incorporating information from other modalities, such as vision. Previous MMT systems mainly focus on better access and use of visual information and tend to…

Computation and Language · Computer Science 2023-09-06 Yaoming Zhu, Zewei Sun, Shanbo Cheng, Luyang Huang, Liwei Wu, Mingxuan Wang

Incorporating Global Visual Features into Attention-Based Neural Machine Translation

We introduce multi-modal, attention-based neural machine translation (NMT) models which incorporate visual features into different parts of both the encoder and the decoder. We utilise global image features extracted using a pre-trained…

Computation and Language · Computer Science 2017-01-24 Iacer Calixto, Qun Liu, Nick Campbell

Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting

Unsupervised machine translation (MT) has recently achieved impressive results with monolingual corpora only. However, it is still challenging to associate source-target sentences in the latent space. As people speak different languages…

Computation and Language · Computer Science 2020-05-08 Po-Yao Huang, Junjie Hu, Xiaojun Chang, Alexander Hauptmann