Phrase-Based Attentions

Computation and Language 2019-08-17 v1

Abstract

Most state-of-the-art neural machine translation systems, despite differing in architectural skeleton (e.g., recurrent, convolutional), share an indispensable feature: attention. However, most existing attention methods are token-based and ignore the importance of phrasal alignments, the key ingredient for the success of phrase-based statistical machine translation. In this paper, we propose novel phrase-based attention methods that model n-grams of tokens as attention entities. We incorporate our phrase-based attentions into the recently proposed Transformer network and demonstrate that our approach yields improvements of 1.3 BLEU for English-to-German and 0.5 BLEU for German-to-English translation on WMT newstest2014, using WMT'16 training data.

Cite

@article{arxiv.1810.03444,
  title  = {Phrase-Based Attentions},
  author = {Phi Xuan Nguyen and Shafiq Joty},
  journal= {arXiv preprint arXiv:1810.03444},
  year   = {2019}
}

Comments

Under review as a conference paper at ICLR 2019