Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model

Computation and Language 2016-05-13 v1 Machine Learning Machine Learning

Abstract

Recent advances in conditional recurrent language modelling have mainly focused on network architectures (e.g., attention mechanism), learning algorithms (e.g., scheduled sampling and sequence-level training) and novel applications (e.g., image/video description generation and speech recognition). On the other hand, we notice that decoding algorithms/strategies have not been investigated as much, and it has become standard to use greedy or beam search. In this paper, we propose a novel decoding strategy motivated by an earlier observation that nonlinear hidden layers of a deep neural network stretch the data manifold. The proposed strategy is embarrassingly parallelizable without any communication overhead, while improving an existing decoding algorithm. We extensively evaluate it with attention-based neural machine translation on the task of En->Cz translation.
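
The abstract does not spell out the procedure, so the following is only a minimal sketch of what a noisy, parallel decoding strategy of this kind might look like. Everything in it is an assumption made for illustration: the toy recurrent model (`make_toy_model`), the choice to add Gaussian noise to the previous hidden state with a standard deviation that decays as `sigma0 / t`, the use of greedy search as the inner decoder, a fixed output length instead of an end-of-sequence token, and the final selection by noise-free log-probability. It is not the paper's implementation.

```python
import math
import random

def make_toy_model(vocab=5, hidden=4, seed=0):
    """Hypothetical toy recurrent LM with random weights (illustration only)."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
    return {"W": mat(hidden, hidden), "U": mat(hidden, vocab), "V": mat(vocab, hidden)}

def log_probs(model, h, prev_tok, noise):
    """One transition: perturb the previous hidden state, then apply tanh."""
    h_in = [hi + ni for hi, ni in zip(h, noise)]
    h_new = [math.tanh(sum(w * x for w, x in zip(row, h_in)) + u_row[prev_tok])
             for row, u_row in zip(model["W"], model["U"])]
    logits = [sum(v * x for v, x in zip(row, h_new)) for row in model["V"]]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return h_new, [l - log_z for l in logits]

def decode_chain(model, h0, length, sigma0, rng):
    """Greedy decoding with (assumed) annealed noise sigma_t = sigma0 / t."""
    h, tok, out = list(h0), 0, []          # token 0 doubles as the start symbol
    for t in range(1, length + 1):
        sigma = sigma0 / t
        noise = [rng.gauss(0.0, sigma) if sigma > 0 else 0.0 for _ in h]
        h, lp = log_probs(model, h, tok, noise)
        tok = max(range(len(lp)), key=lp.__getitem__)
        out.append(tok)
    return out

def score(model, h0, tokens):
    """Log-probability of a token sequence under the noise-free model."""
    h, tok, total = list(h0), 0, 0.0
    zeros = [0.0] * len(h0)
    for y in tokens:
        h, lp = log_probs(model, h, tok, zeros)
        total += lp[y]
        tok = y
    return total

def npad(model, h0, length=6, chains=8, sigma0=0.5, seed=1):
    """Run independent chains (one noise-free) and keep the best-scoring one."""
    candidates = [decode_chain(model, h0, length, 0.0, random.Random(seed))]
    for m in range(1, chains):
        candidates.append(
            decode_chain(model, h0, length, sigma0, random.Random(seed + m)))
    return max(candidates, key=lambda c: score(model, h0, c))
```

Because each chain depends only on its own random stream, the loop in `npad` could be distributed across workers with no communication until the final comparison. Including one noise-free chain means the selected sequence never scores below plain greedy decoding under this sketch's assumptions.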

Cite

@article{arxiv.1605.03835,
  title  = {Noisy Parallel Approximate Decoding for Conditional Recurrent Language Model},
  author = {Kyunghyun Cho},
  journal= {arXiv preprint arXiv:1605.03835},
  year   = {2016}
}