Related papers: A General-Purpose Multilingual Document Encoder

Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval

Pretrained multilingual text encoders based on neural Transformer architectures, such as multilingual BERT (mBERT) and XLM, have achieved strong performance on a myriad of language understanding tasks. Consequently, they have been adopted…

Computation and Language · Computer Science 2021-01-22 Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization

Pre-trained language models (PLMs) have achieved outstanding results in abstractive single-document summarization (SDS). However, such benefits may not fully extend to multi-document summarization (MDS), where the handling of…

Computation and Language · Computer Science 2023-11-02 Chenhui Shen, Liying Cheng, Xuan-Phi Nguyen, Yang You, Lidong Bing

Massively Multilingual Lexical Specialization of Multilingual Transformers

While pretrained language models (PLMs) primarily serve as general-purpose text encoders that can be fine-tuned for a wide variety of downstream tasks, recent work has shown that they can also be rewired to produce high-quality word…

Computation and Language · Computer Science 2023-05-30 Tommaso Green, Simone Paolo Ponzetto, Goran Glavaš

Memory Reviving, Continuing Learning and Beyond: Evaluation of Pre-trained Encoders and Decoders for Multimodal Machine Translation

Multimodal Machine Translation (MMT) aims to improve translation quality by leveraging auxiliary modalities such as images alongside textual input. While recent advances in large-scale pre-trained language and vision models have…

Computation and Language · Computer Science 2025-04-28 Zhuang Yu, Shiliang Sun, Jing Zhao, Tengfei Song, Hao Yang

DOCmT5: Document-Level Pretraining of Multilingual Language Models

In this paper, we introduce DOCmT5, a multilingual sequence-to-sequence language model pretrained with large scale parallel documents. While previous approaches have focused on leveraging sentence-level parallel data, we try to build a…

Computation and Language · Computer Science 2022-05-06 Chia-Hsuan Lee, Aditya Siddhant, Viresh Ratnakar, Melvin Johnson

Exploring Unsupervised Pretraining Objectives for Machine Translation

Unsupervised cross-lingual pretraining has achieved strong results in neural machine translation (NMT), by drastically reducing the need for large parallel data. Most approaches adapt masked-language modeling (MLM) to sequence-to-sequence…

Computation and Language · Computer Science 2021-06-11 Christos Baziotis, Ivan Titov, Alexandra Birch, Barry Haddow

Beyond Decoder-only: Large Language Models Can be Good Encoders for Machine Translation

The field of neural machine translation (NMT) has changed with the advent of large language models (LLMs). Much of the recent emphasis in natural language processing (NLP) has been on modeling machine translation and many other problems…

Computation and Language · Computer Science 2025-06-03 Yingfeng Luo, Tong Zheng, Yongyu Mu, Bei Li, Qinghong Zhang, Yongqi Gao, Ziqiang Xu, Peinan Feng, Xiaoqian Liu, Tong Xiao, Jingbo Zhu

HLT-MT: High-resource Language-specific Training for Multilingual Neural Machine Translation

Multilingual neural machine translation (MNMT) trained in multiple language pairs has attracted considerable attention due to fewer model parameters and lower training costs by sharing knowledge among multiple languages. Nonetheless,…

Computation and Language · Computer Science 2022-07-21 Jian Yang, Yuwei Yin, Shuming Ma, Dongdong Zhang, Zhoujun Li, Furu Wei

Hierarchical Multi-modal Transformer for Cross-modal Long Document Classification

Long Document Classification (LDC) has gained significant attention recently. However, multi-modal data in long documents such as texts and images are not being effectively utilized. Prior studies in this area have attempted to integrate…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Tengfei Liu, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin

Hierarchical Document Encoder for Parallel Corpus Mining

We explore using multilingual document embeddings for nearest neighbor mining of parallel data. Three document-level representations are investigated: (i) document embeddings generated by simply averaging multilingual sentence embeddings;…

Computation and Language · Computer Science 2019-07-02 Mandy Guo, Yinfei Yang, Keith Stevens, Daniel Cer, Heming Ge, Yun-Hsuan Sung, Brian Strope, Ray Kurzweil

Breaking Down Multilingual Machine Translation

While multilingual training is now an essential ingredient in machine translation (MT) systems, recent work has demonstrated that it has different effects in different multilingual settings, such as many-to-one, one-to-many, and…

Computation and Language · Computer Science 2022-04-06 Ting-Rui Chiang, Yi-Pei Chen, Yi-Ting Yeh, Graham Neubig

Multilingual Neural Machine Translation With Soft Decoupled Encoding

Multilingual training of neural machine translation (NMT) systems has led to impressive accuracy improvements on low-resource languages. However, there are still significant challenges in efficiently learning word representations in the…

Computation and Language · Computer Science 2019-02-12 Xinyi Wang, Hieu Pham, Philip Arthur, Graham Neubig

XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders

Multilingual machine translation enables a single model to translate between different languages. Most existing multilingual machine translation systems adopt a randomly initialized Transformer backbone. In this work, inspired by the recent…

Computation and Language · Computer Science 2021-01-01 Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei

Microsoft Translator at WMT 2019: Towards Large-Scale Document-Level Neural Machine Translation

This paper describes the Microsoft Translator submissions to the WMT19 news translation shared task for English-German. Our main focus is document-level neural machine translation with deep transformer models. We start with strong…

Computation and Language · Computer Science 2019-07-16 Marcin Junczys-Dowmunt

Bridging the Linguistic Divide: A Survey on Leveraging Large Language Models for Machine Translation

Large Language Models (LLMs) are rapidly reshaping machine translation (MT), particularly by introducing instruction-following, in-context learning, and preference-based alignment into what has traditionally been a supervised…

Computation and Language · Computer Science 2026-04-29 Baban Gain, Dibyanayan Bandyopadhyay, Asif Ekbal, Trilok Nath Singh

On Cross-Lingual Retrieval with Multilingual Text Encoders

In this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a number of diverse language pairs. We first treat…

Computation and Language · Computer Science 2021-12-22 Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš

Supervised Contrastive Learning for Interpretable Long-Form Document Matching

Recent advancements in deep learning techniques have transformed the area of semantic text matching. However, most state-of-the-art models are designed to operate with short documents such as tweets, user reviews, comments, etc. These…

Information Retrieval · Computer Science 2022-06-03 Akshita Jha, Vineeth Rakesh, Jaideep Chandrashekar, Adithya Samavedhi, Chandan K. Reddy

Multilingual Machine Translation: Closing the Gap between Shared and Language-specific Encoder-Decoders

State-of-the-art multilingual machine translation relies on a universal encoder-decoder, which requires retraining the entire system to add new languages. In this paper, we propose an alternative approach that is based on language-specific…

Computation and Language · Computer Science 2020-04-15 Carlos Escolano, Marta R. Costa-jussà, José A. R. Fonollosa, Mikel Artetxe

Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder

In this paper, we present our first attempts in building a multilingual Neural Machine Translation framework under a unified approach. We are then able to employ attention-based NMT for many-to-many multilingual translation tasks. Our…

Computation and Language · Computer Science 2016-11-16 Thanh-Le Ha, Jan Niehues, Alexander Waibel

Towards a Multi-modal, Multi-task Learning based Pre-training Framework for Document Representation Learning

Recent approaches in literature have exploited the multi-modal information in documents (text, layout, image) to serve specific downstream document tasks. However, they are limited by their - (i) inability to learn cross-modal…

Computation and Language · Computer Science 2022-01-06 Subhojeet Pramanik, Shashank Mujumdar, Hima Patel