Learning Comment Generation by Leveraging User-Generated Data

Computation and Language 2019-02-28 v2

Abstract

Existing models for open-domain comment generation are difficult to train, and they produce repetitive and uninteresting responses. The problem stems from the multiple and contradictory responses associated with a single article and from the rigidity of retrieval methods. To solve this problem, we propose an approach that combines retrieval and generation methods. We propose an attentive scorer that retrieves informative and relevant comments by leveraging user-generated data. We then use these comments, together with the article, as input to a sequence-to-sequence model with a copy mechanism. Using a large-scale comment generation dataset, we show the robustness of our model and how it alleviates the aforementioned issue. The results show that the proposed generative model significantly outperforms strong baselines such as Seq2Seq with attention and Information Retrieval models by around 27 and 30 BLEU-1 points, respectively.

Cite

@article{arxiv.1810.12264,
  title  = {Learning Comment Generation by Leveraging User-Generated Data},
  author = {Zhaojiang Lin and Genta Indra Winata and Pascale Fung},
  journal= {arXiv preprint arXiv:1810.12264},
  year   = {2019}
}

Comments

Accepted by ICASSP 2019
