SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers

Hongyi Yuan; Zheng Yuan; Chuanqi Tan; Fei Huang; Songfang Huang

Computation and Language 2023-05-23 v5

Abstract

Diffusion models, a new generative modelling paradigm, have achieved great success in image, audio, and video generation. However, because text is discrete and categorical, extending continuous diffusion models to natural language is not trivial, and text diffusion models are less studied. Sequence-to-sequence text generation is one of the essential topics in natural language processing. In this work, we apply diffusion models to sequence-to-sequence text generation and explore whether the superior generation performance of diffusion models transfers to the natural language domain. We propose SeqDiffuSeq, a text diffusion model for sequence-to-sequence generation. SeqDiffuSeq uses an encoder-decoder Transformer architecture to model the denoising function. To improve generation quality, SeqDiffuSeq combines the self-conditioning technique with a newly proposed adaptive noise schedule technique. The adaptive noise schedule distributes the difficulty of denoising evenly across time steps and uses a separate noise schedule for tokens at each position in the sequence. Experimental results show good performance on sequence-to-sequence generation in terms of both text quality and inference time.
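
The idea of a separate noise schedule per token position can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract gives no formulas, so the code uses a standard Gaussian forward process on continuous token embeddings, and the functions `make_schedules` and `add_noise`, as well as the closed-form schedule itself, are hypothetical. In particular, the sketch does not implement the adaptive procedure that balances denoising difficulty across time steps; it only shows what a position-specific schedule looks like.

```python
import math
import random


def make_schedules(num_steps, seq_len):
    """Build a hypothetical table alpha_bar[t][i] of signal levels.

    Each position i gets its own curve; every curve starts at 1.0 (no noise)
    at t = 0 and decreases towards 0.001 at t = num_steps.
    """
    schedules = []
    for t in range(num_steps + 1):
        frac = t / num_steps
        row = []
        for i in range(seq_len):
            # Assumed form: the exponent varies with position, so earlier
            # positions lose signal faster in the first steps.
            power = 0.5 + 0.5 * i / max(seq_len - 1, 1)
            row.append(1.0 - 0.999 * frac ** power)
        schedules.append(row)
    return schedules


def add_noise(z0, alpha_bar_t, rng):
    """Sample z_t = sqrt(a) * z_0 + sqrt(1 - a) * eps, with one a per position."""
    zt = []
    for vec, a in zip(z0, alpha_bar_t):
        zt.append(
            [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in vec]
        )
    return zt


# Usage: four token embeddings of dimension two, noised at step 5 of 10.
rng = random.Random(0)
schedules = make_schedules(num_steps=10, seq_len=4)
z0 = [[1.0, -2.0], [0.5, 0.0], [3.0, 1.0], [0.0, 0.0]]
z5 = add_noise(z0, schedules[5], rng)
```

In this sketch each row of `schedules` holds one signal level per position, so tokens at different positions receive different amounts of noise at the same time step.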

Keywords

diffusion model; audio generation; model transformation

Cite

@article{arxiv.2212.10325,
  title  = {SeqDiffuSeq: Text Diffusion with Encoder-Decoder Transformers},
  author = {Hongyi Yuan and Zheng Yuan and Chuanqi Tan and Fei Huang and Songfang Huang},
  journal = {arXiv preprint arXiv:2212.10325},
  year   = {2023}
}

Comments

Under Review
