A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization

Computation and Language 2023-11-02 v5

Abstract

Pre-trained language models (PLMs) have achieved outstanding results in abstractive single-document summarization (SDS). However, such benefits may not fully extend to multi-document summarization (MDS), where the handling of cross-document information is more complex. Previous works either design new MDS architectures or apply PLMs directly to the concatenated source documents, recasting MDS as an SDS task. While the former does not utilize previous pre-training efforts and may not generalize well across different domains, the latter may not sufficiently attend to the intricate cross-document relationships unique to MDS tasks. Instead, we enforce hierarchy on both the encoder and decoder to better utilize a PLM to facilitate multi-document interactions for the MDS task. Across 10 MDS benchmarks from various domains, our method outperforms or is competitive with the previous best models, including those with additional MDS pre-training or with more parameters. It outperforms its corresponding PLM backbone by up to 3 ROUGE-L points and is preferred by human evaluators.

Cite

@article{arxiv.2305.08503,
  title   = {A Hierarchical Encoding-Decoding Scheme for Abstractive Multi-document Summarization},
  author  = {Chenhui Shen and Liying Cheng and Xuan-Phi Nguyen and Yang You and Lidong Bing},
  journal = {arXiv preprint arXiv:2305.08503},
  year    = {2023}
}

Comments

16 pages, 3 figures