English
Related papers

Related papers: Word Translation Without Parallel Data

200 papers

Machine translation has recently achieved impressive performance thanks to recent advances in deep learning and the availability of large-scale parallel corpora. There have been numerous attempts to extend these successes to low-resource…

Computation and Language · Computer Science 2018-04-16 Guillaume Lample , Alexis Conneau , Ludovic Denoyer , Marc'Aurelio Ranzato

Most existing methods for automatic bilingual dictionary induction rely on prior alignments between the source and target languages, such as parallel corpora or seed dictionaries. For many language pairs, such supervised alignments are not…

Computation and Language · Computer Science 2018-03-26 Hanan Aldarmaki , Mahesh Mohan , Mona Diab

Cross-lingual transfer of word embeddings aims to establish the semantic mappings among words in different languages by learning the transformation functions over the corresponding word embedding spaces. Successfully solving this problem…

Computation and Language · Computer Science 2018-09-12 Ruochen Xu , Yiming Yang , Naoki Otani , Yuexin Wu

Crosslingual word embeddings represent lexical items from different languages in the same vector space, enabling transfer of NLP tools. However, previous attempts had expensive resource requirements, difficulty incorporating monolingual…

Computation and Language · Computer Science 2016-07-01 Long Duong , Hiroshi Kanayama , Tengfei Ma , Steven Bird , Trevor Cohn

Existing models of multilingual sentence embeddings require large parallel data resources which are not available for low-resource languages. We propose a novel unsupervised method to derive multilingual sentence embeddings relying only on…

Computation and Language · Computer Science 2021-05-24 Ivana Kvapilıkova , Mikel Artetxe , Gorka Labaka , Eneko Agirre , Ondřej Bojar

Recent progress on unsupervised learning of cross-lingual embeddings in bilingual setting has given impetus to learning a shared embedding space for several languages without any supervision. A popular framework to solve the latter problem…

Computation and Language · Computer Science 2020-04-21 Pratik Jawanpuria , Mayank Meghwanshi , Bamdev Mishra

Recent work has managed to learn cross-lingual word embeddings without parallel data by mapping monolingual embeddings to a shared space through adversarial training. However, their evaluation has focused on favorable conditions, using…

Computation and Language · Computer Science 2021-12-28 Mikel Artetxe , Gorka Labaka , Eneko Agirre

Recent research has shown that word embedding spaces learned from text corpora of different languages can be aligned without any parallel data supervision. Inspired by the success in unsupervised cross-lingual word embeddings, in this paper…

Computation and Language · Computer Science 2018-09-24 Yu-An Chung , Wei-Hung Weng , Schrasing Tong , James Glass

Cross-lingual word embeddings are vector representations of words in different languages where words with similar meaning are represented by similar vectors, regardless of the language. Recent developments which construct these embeddings…

Computation and Language · Computer Science 2020-03-04 Yerai Doval , Jose Camacho-Collados , Luis Espinosa-Anke , Steven Schockaert

A recent research line has obtained strong results on bilingual lexicon induction by aligning independently trained word embeddings in two languages and using the resulting cross-lingual embeddings to induce word translation pairs through…

Computation and Language · Computer Science 2021-12-28 Mikel Artetxe , Gorka Labaka , Eneko Agirre

We propose an unsupervised method to obtain cross-lingual embeddings without any parallel data or pre-trained word embeddings. The proposed model, which we call multilingual neural language models, takes sentences of multiple languages as…

Computation and Language · Computer Science 2018-09-10 Takashi Wada , Tomoharu Iwata

In this paper, we propose a new task of machine translation (MT), which is based on no parallel sentences but can refer to a ground-truth bilingual dictionary. Motivated by the ability of a monolingual speaker learning to translate via…

Computation and Language · Computer Science 2020-07-07 Xiangyu Duan , Baijun Ji , Hao Jia , Min Tan , Min Zhang , Boxing Chen , Weihua Luo , Yue Zhang

In spite of the recent success of neural machine translation (NMT) in standard benchmarks, the lack of large parallel corpora poses a major practical problem for many language pairs. There have been several proposals to alleviate this issue…

Computation and Language · Computer Science 2018-02-27 Mikel Artetxe , Gorka Labaka , Eneko Agirre , Kyunghyun Cho

Unsupervised cross-lingual word embedding (CLWE) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora. This method relies on the assumption that the two…

Computation and Language · Computer Science 2021-06-04 Sosuke Nishikawa , Ryokan Ri , Yoshimasa Tsuruoka

Cross-lingual word embeddings aim to bridge the gap between high-resource and low-resource languages by allowing to learn multilingual word representations even without using any direct bilingual signal. The lion's share of the methods are…

Computation and Language · Computer Science 2020-09-03 Magdalena Biesialska , Marta R. Costa-jussà

Bilingual lexicons and phrase tables are critical resources for modern Machine Translation systems. Although recent results show that without any seed lexicon or parallel data, highly accurate bilingual lexicons can be learned using…

Computation and Language · Computer Science 2020-09-01 Ashiqur R. KhudaBukhsh , Shriphani Palakodety , Tom M. Mitchell

The availability of parallel sentence simplification (SS) is scarce for neural SS modelings. We propose an unsupervised method to build SS corpora from large-scale bilingual translation corpora, alleviating the need for SS supervised…

Computation and Language · Computer Science 2021-09-02 Xinyu Lu , Jipeng Qiang , Yun Li , Yunhao Yuan , Yi Zhu

Cross-lingual word embeddings aim to capture common linguistic regularities of different languages, which benefit various downstream tasks ranging from machine translation to transfer learning. Recently, it has been shown that these…

Computation and Language · Computer Science 2018-11-02 Pengcheng Yang , Fuli Luo , Shuangzhi Wu , Jingjing Xu , Dongdong Zhang , Xu Sun

We consider the problem of aligning two sets of continuous word representations, corresponding to languages, to a common space in order to infer a bilingual lexicon. It was recently shown that it is possible to infer such lexicon, without…

Computation and Language · Computer Science 2024-02-13 Paul Garnier , Gauthier Guinet

We propose a novel model architecture and training algorithm to learn bilingual sentence embeddings from a combination of parallel and monolingual data. Our method connects autoencoding and neural machine translation to force the source and…

Computation and Language · Computer Science 2019-06-06 Yunsu Kim , Hendrik Rosendahl , Nick Rossenbach , Jan Rosendahl , Shahram Khadivi , Hermann Ney
‹ Prev 1 2 3 10 Next ›