Related papers: Phrase-Based Attentions

Effective Approaches to Attention-based Neural Machine Translation

An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for…

Computation and Language · Computer Science 2015-09-22 Minh-Thang Luong, Hieu Pham, Christopher D. Manning

Neural Machine Translation Leveraging Phrase-based Models in a Hybrid Search

In this paper, we introduce a hybrid search for attention-based neural machine translation (NMT). A target phrase learned with statistical MT models extends a hypothesis in the NMT beam search when the attention of the NMT model focuses on…

Computation and Language · Computer Science 2017-08-11 Leonard Dahlmann, Evgeny Matusov, Pavel Petrushkov, Shahram Khadivi

Weighted Transformer Network for Machine Translation

State-of-the-art results on neural machine translation often use attentional sequence-to-sequence models with some form of convolution or recursion. Vaswani et al. (2017) propose a new architecture that avoids recurrence and convolution…

Artificial Intelligence · Computer Science 2017-11-08 Karim Ahmed, Nitish Shirish Keskar, Richard Socher

Learning Source Phrase Representations for Neural Machine Translation

The Transformer translation model (Vaswani et al., 2017) based on a multi-head attention mechanism can be computed effectively in parallel and has significantly pushed forward the performance of Neural Machine Translation (NMT). Though…

Computation and Language · Computer Science 2020-06-26 Hongfei Xu, Josef van Genabith, Deyi Xiong, Qiuhui Liu, Jingyi Zhang

Syntax-Infused Transformer and BERT models for Machine Translation and Natural Language Understanding

Attention-based models have shown significant improvement over traditional algorithms in several NLP tasks. The Transformer, for instance, is an illustrative example that generates abstract representations of tokens inputted to an encoder…

Computation and Language · Computer Science 2019-11-15 Dhanasekar Sundararaman, Vivek Subramanian, Guoyin Wang, Shijing Si, Dinghan Shen, Dong Wang, Lawrence Carin

Learning When to Attend for Neural Machine Translation

In the past few years, attention mechanisms have become an indispensable component of end-to-end neural machine translation models. However, previous attention models always refer to some source words when predicting a target word, which…

Computation and Language · Computer Science 2017-06-01 Junhui Li, Muhua Zhu

Transformer++

Recent advancements in attention mechanisms have replaced recurrent neural networks and its variants for machine translation tasks. Transformer using attention mechanism solely achieved state-of-the-art results in sequence modeling. Neural…

Computation and Language · Computer Science 2020-04-02 Prakhar Thapak, Prodip Hore

Pre-Translation for Neural Machine Translation

Recently, the development of neural machine translation (NMT) has significantly improved the translation quality of automatic machine translation. While most sentences are more accurate and fluent than translations by statistical machine…

Computation and Language · Computer Science 2016-10-18 Jan Niehues, Eunah Cho, Thanh-Le Ha, Alex Waibel

Selective Attention for Context-aware Neural Machine Translation

Despite the progress made in sentence-level NMT, current systems still fall short at achieving fluent, good quality translation for a full document. Recent works in context-aware NMT consider only a few previous sentences as context and may…

Computation and Language · Computer Science 2019-05-27 Sameen Maruf, André F. T. Martins, Gholamreza Haffari

Attention Is All You Need

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism.…

Computation and Language · Computer Science 2023-08-03 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

Tree-to-Sequence Attentional Neural Machine Translation

Most of the existing Neural Machine Translation (NMT) models focus on the conversion of sequential data and do not directly use syntactic information. We propose a novel end-to-end syntactic NMT model, extending a sequence-to-sequence model…

Computation and Language · Computer Science 2016-06-09 Akiko Eriguchi, Kazuma Hashimoto, Yoshimasa Tsuruoka

Neural Machine Translation with Key-Value Memory-Augmented Attention

Although attention-based Neural Machine Translation (NMT) has achieved remarkable progress in recent years, it still suffers from issues of repeating and dropping translations. To alleviate these issues, we propose a novel key-value…

Computation and Language · Computer Science 2018-07-02 Fandong Meng, Zhaopeng Tu, Yong Cheng, Haiyang Wu, Junjie Zhai, Yuekui Yang, Di Wang

Interactive Attention for Neural Machine Translation

Conventional attention-based Neural Machine Translation (NMT) conducts dynamic alignment in generating the target sentence. By repeatedly reading the representation of source sentence, which keeps fixed after generated by the encoder…

Computation and Language · Computer Science 2016-10-18 Fandong Meng, Zhengdong Lu, Hang Li, Qun Liu

The AMU-UEDIN Submission to the WMT16 News Translation Task: Attention-based NMT Models as Feature Functions in Phrase-based SMT

This paper describes the AMU-UEDIN submissions to the WMT 2016 shared task on news translation. We explore methods of decode-time integration of attention-based neural translation models with phrase-based statistical machine translation.…

Computation and Language · Computer Science 2016-06-24 Marcin Junczys-Dowmunt, Tomasz Dwojak, Rico Sennrich

Interrogating the Explanatory Power of Attention in Neural Machine Translation

Attention models have become a crucial component in neural machine translation (NMT). They are often implicitly or explicitly used to justify the model's decision in generating a specific token but it has not yet been rigorously established…

Computation and Language · Computer Science 2019-10-02 Pooya Moradi, Nishant Kambhatla, Anoop Sarkar

Neural Machine Translation with Recurrent Attention Modeling

Knowing which words have been attended to in previous time steps while generating a translation is a rich source of information for predicting what words will be attended to in the future. We improve upon the attention model of Bahdanau et…

Neural and Evolutionary Computing · Computer Science 2016-07-19 Zichao Yang, Zhiting Hu, Yuntian Deng, Chris Dyer, Alex Smola

Enhancing Machine Translation with Dependency-Aware Self-Attention

Most neural machine translation models only rely on pairs of parallel sentences, assuming syntactic information is automatically learned by an attention mechanism. In this work, we investigate different approaches to incorporate syntactic…

Computation and Language · Computer Science 2020-04-22 Emanuele Bugliarello, Naoaki Okazaki

Phrase-Based & Neural Unsupervised Machine Translation

Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of…

Computation and Language · Computer Science 2018-08-15 Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato

Neural Phrase-to-Phrase Machine Translation

In this paper, we propose Neural Phrase-to-Phrase Machine Translation (NP$^2$MT). Our model uses a phrase attention mechanism to discover relevant input (source) segments that are used by a decoder to generate output (target) phrases. We…

Computation and Language · Computer Science 2018-11-07 Jiangtao Feng, Lingpeng Kong, Po-Sen Huang, Chong Wang, Da Huang, Jiayuan Mao, Kan Qiao, Dengyong Zhou

Training Deeper Neural Machine Translation Models with Transparent Attention

While current state-of-the-art NMT models, such as RNN seq2seq and Transformers, possess a large number of parameters, they are still shallow in comparison to convolutional models used for both text and vision applications. In this work we…

Computation and Language · Computer Science 2018-09-06 Ankur Bapna, Mia Xu Chen, Orhan Firat, Yuan Cao, Yonghui Wu