Large Margin Neural Language Model

Jiaji Huang; Yi Li; Wei Ping; Liang Huang

Computation and Language 2018-08-29 v1

Abstract

We propose a large margin criterion for training neural language models. Conventionally, neural language models are trained by minimizing perplexity (PPL) on grammatical sentences. However, we demonstrate that PPL may not be the best metric to optimize in some tasks, and further propose a large margin formulation. The proposed method aims to enlarge the margin between the "good" and "bad" sentences in a task-specific sense. It is trained end-to-end and can be widely applied to tasks that involve re-scoring of generated text. Compared with minimum-PPL training, our method gains up to 1.1 WER reduction for speech recognition and 1.0 BLEU increase for machine translation.
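As a rough illustration of what enlarging the margin between "good" and "bad" sentences could look like for re-scoring, the sketch below computes a pairwise hinge loss over candidate sentences. It is an assumption for illustration, not the formulation from the paper: the function name `pairwise_margin_loss`, the `margin` parameter, and the convention that candidates arrive sorted from best to worst by the task metric are all hypothetical.

```python
from typing import Sequence


def pairwise_margin_loss(log_probs: Sequence[float], margin: float = 1.0) -> float:
    """Hinge loss over all ordered pairs of candidate sentences.

    log_probs holds the language model's log-probability for each candidate,
    assumed (hypothetically) to be sorted from best to worst by a task metric
    such as WER or BLEU. A pair contributes to the loss whenever the better
    candidate does not outscore the worse one by at least `margin`.
    """
    loss = 0.0
    n = len(log_probs)
    for i in range(n):
        for j in range(i + 1, n):
            # Candidate i is better than candidate j by assumption.
            loss += max(0.0, margin - (log_probs[i] - log_probs[j]))
    return loss


# Candidates already separated by more than the margin incur no loss.
well_separated = pairwise_margin_loss([-10.0, -12.0, -15.0], margin=1.0)

# Swapping the scores of the two best candidates is penalized.
misranked = pairwise_margin_loss([-12.0, -10.0, -15.0], margin=1.0)
```

Unlike a perplexity objective, which only raises the likelihood of reference text, a loss of this shape is zero once the task-preferred candidates are ranked above the others by the chosen margin, so it directly targets the re-scoring order.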

Keywords

large language model training, language modeling, instruction tuning

Cite

@article{arxiv.1808.08987,
  title  = {Large Margin Neural Language Model},
  author = {Jiaji Huang and Yi Li and Wei Ping and Liang Huang},
  journal= {arXiv preprint arXiv:1808.08987},
  year   = {2018}
}

Comments

9 pages. Accepted as a long paper at EMNLP 2018.
