MAPGN: MAsked Pointer-Generator Network for sequence-to-sequence pre-training

Computation and Language 2021-02-17 v2

Abstract

This paper presents a self-supervised learning method for pointer-generator networks to improve spoken-text normalization. Spoken-text normalization, which converts spoken-style text into style-normalized text, is becoming an important technology for improving subsequent processing such as machine translation and summarization. The most successful spoken-text normalization method to date is sequence-to-sequence (seq2seq) mapping using pointer-generator networks, which possess a mechanism for copying tokens from the input sequence. However, these models require a large amount of paired spoken-style and style-normalized text, and it is difficult to prepare such a volume of data. To construct a spoken-text normalization model from limited paired data, we focus on self-supervised learning, which can utilize unpaired text data to improve seq2seq models. Unfortunately, conventional self-supervised learning methods do not assume that pointer-generator networks are utilized. Therefore, we propose a novel self-supervised learning method, MAsked Pointer-Generator Network (MAPGN). The proposed method can effectively pre-train the pointer-generator network by learning to fill masked tokens using the copy mechanism. Our experiments demonstrate that MAPGN is more effective for pointer-generator networks than conventional self-supervised learning methods in two spoken-text normalization tasks.
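
To make the copy mechanism mentioned in the abstract concrete, the following is a minimal sketch of how a generic pointer-generator network combines a generation distribution with a copy distribution at one decoding step. It follows the standard pointer-generator formulation, not necessarily the exact model or masking scheme used in MAPGN, which this abstract does not specify. The function name, argument names, and toy values are assumptions made for illustration.

```python
from typing import Dict, List


def pointer_generator_distribution(
    p_gen: float,
    vocab_probs: Dict[str, float],
    attention: List[float],
    source_tokens: List[str],
) -> Dict[str, float]:
    """Mix generation and copy distributions for one decoding step.

    Hypothetical helper for illustration only:
      P(w) = p_gen * P_vocab(w) + (1 - p_gen) * sum of attention on source positions holding w
    """
    if len(attention) != len(source_tokens):
        raise ValueError("attention must have one weight per source token")

    # Generation part: scale the vocabulary distribution by p_gen.
    final = {token: p_gen * prob for token, prob in vocab_probs.items()}

    # Copy part: add attention mass to whichever token sits at each source position.
    # Source tokens outside the vocabulary still receive probability this way.
    for weight, token in zip(attention, source_tokens):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight

    return final


# Toy example (made-up values): "tokyo" is not in the vocabulary but can be copied.
dist = pointer_generator_distribution(
    p_gen=0.5,
    vocab_probs={"i": 0.5, "go": 0.5},
    attention=[0.25, 0.75],
    source_tokens=["i", "tokyo"],
)
print(dist)
```

Because the copy term assigns probability directly to input tokens, a pre-training objective that fills masked tokens can, in principle, learn to recover unmasked context by copying rather than by generating from the vocabulary alone.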

Cite

@article{arxiv.2102.07380,
  title  = {MAPGN: MAsked Pointer-Generator Network for sequence-to-sequence pre-training},
  author = {Mana Ihori and Naoki Makishima and Tomohiro Tanaka and Akihiko Takashima and Shota Orihashi and Ryo Masumura},
  journal= {arXiv preprint arXiv:2102.07380},
  year   = {2021}
}

Comments

Accepted at ICASSP 2021
