Related papers: Sample Efficient Text Summarization Using a Single…

Pre-trained Language Model Representations for Language Generation

Pre-trained language model representations have been successful in a wide range of language understanding tasks. In this paper, we examine different strategies to integrate pre-trained representations into sequence to sequence models and…

Computation and Language · Computer Science 2019-04-02 Sergey Edunov, Alexei Baevski, Michael Auli

Condenser: a Pre-training Architecture for Dense Retrieval

Pre-trained Transformer language models (LM) have become go-to text representation encoders. Prior research fine-tunes deep LMs to encode text sequences such as sentences and passages into single dense vector representations for efficient…

Computation and Language · Computer Science 2021-09-22 Luyu Gao, Jamie Callan

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization

This paper presents Z-Code++, a new pre-trained language model optimized for abstractive text summarization. The model extends the state of the art encoder-decoder model using three techniques. First, we use a two-phase pre-training process…

Computation and Language · Computer Science 2023-06-08 Pengcheng He, Baolin Peng, Liyang Lu, Song Wang, Jie Mei, Yang Liu, Ruochen Xu, Hany Hassan Awadalla, Yu Shi, Chenguang Zhu, Wayne Xiong, Michael Zeng, Jianfeng Gao, Xuedong Huang

Repurposing Decoder-Transformer Language Models for Abstractive Summarization

Neural network models have shown excellent fluency and performance when applied to abstractive summarization. Many approaches to neural abstractive summarization involve the introduction of significant inductive bias, exemplified through…

Computation and Language · Computer Science 2019-09-04 Luke de Oliveira, Alfredo Láinez Rodrigo

Abstractive Summarization with Combination of Pre-trained Sequence-to-Sequence and Saliency Models

Pre-trained sequence-to-sequence (seq-to-seq) models have significantly improved the accuracy of several language generation tasks, including abstractive summarization. Although the fluency of abstractive summarization has been greatly…

Computation and Language · Computer Science 2020-03-31 Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Junji Tomita

MeetSum: Transforming Meeting Transcript Summarization using Transformers!

Creating abstractive summaries from meeting transcripts has proven to be challenging due to the limited amount of labeled data available for training neural network models. Moreover, Transformer-based architectures have proven to beat…

Computation and Language · Computer Science 2021-08-16 Nima Sadri, Bohan Zhang, Bihan Liu

A Deep Reinforced Model for Abstractive Summarization

Attentional, RNN-based encoder-decoder models for abstractive summarization have achieved good performance on short input and output sequences. For longer documents and summaries however these models often include repetitive and incoherent…

Computation and Language · Computer Science 2017-11-15 Romain Paulus, Caiming Xiong, Richard Socher

Efficient Adaptation of Pretrained Transformers for Abstractive Summarization

Large-scale learning of transformer language models has yielded improvements on a variety of natural language understanding tasks. Whether they can be effectively adapted for summarization, however, has been less explored, as the learned…

Computation and Language · Computer Science 2019-06-04 Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi

Evaluating LLMs and Pre-trained Models for Text Summarization Across Diverse Datasets

Text summarization plays a crucial role in natural language processing by condensing large volumes of text into concise and coherent summaries. As digital content continues to grow rapidly and the demand for effective information retrieval…

Computation and Language · Computer Science 2025-03-14 Tohida Rehman, Soumabha Ghosh, Kuntal Das, Souvik Bhattacharjee, Debarshi Kumar Sanyal, Samiran Chattopadhyay

Survey on Abstractive Text Summarization: Dataset, Models, and Metrics

The advancements in deep learning, particularly the introduction of transformers, have been pivotal in enhancing various natural language processing (NLP) tasks. These include text-to-text applications such as machine translation, text…

Artificial Intelligence · Computer Science 2024-12-24 Gospel Ozioma Nnadi, Flavio Bertini

Long-Span Summarization via Local Attention and Content Selection

Transformer-based models have achieved state-of-the-art results in a wide range of natural language processing (NLP) tasks including document summarization. Typically these systems are trained by fine-tuning a large pre-trained model to the…

Computation and Language · Computer Science 2021-06-01 Potsawee Manakul, Mark J. F. Gales

Efficient Long-Text Understanding with Short-Text Models

Transformer-based pretrained language models (LMs) are ubiquitous across natural language understanding, but cannot be applied to long sequences such as stories, scientific articles and long documents, due to their quadratic complexity.…

Computation and Language · Computer Science 2022-12-29 Maor Ivgi, Uri Shaham, Jonathan Berant

Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

In this paper, we show that a simple self-supervised pre-trained audio model can achieve comparable inference efficiency to more complicated pre-trained models with speech transformer encoders. These speech transformers rely on mixing…

Sound · Computer Science 2024-02-09 Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

Style Attuned Pre-training and Parameter Efficient Fine-tuning for Spoken Language Understanding

Neural models have yielded state-of-the-art results in deciphering spoken language understanding (SLU) problems; however, these models require a significant amount of domain-specific labeled examples for training, which is prohibitively…

Computation and Language · Computer Science 2020-10-12 Jin Cao, Jun Wang, Wael Hamza, Kelly Vanee, Shang-Wen Li

Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study

In this study, we delve into the efficacy of transformers within pre-trained language models (PLMs) when repurposed as encoders for Automatic Speech Recognition (ASR). Our underlying hypothesis posits that, despite being initially trained…

Audio and Speech Processing · Electrical Eng. & Systems 2024-09-27 Keyu An, Shiliang Zhang, Zhijie Yan

LIME: Making LLM Data More Efficient with Linguistic Metadata Embeddings

Pre-training decoder-only language models relies on vast amounts of high-quality data, yet the availability of such data is increasingly reaching its limits. While metadata is commonly used to create and curate these datasets, its potential…

Computation and Language · Computer Science 2025-12-09 Sebastian Sztwiertnia, Felix Friedrich, Kristian Kersting, Patrick Schramowski, Björn Deiseroth

Length-controllable Abstractive Summarization by Guiding with Summary Prototype

We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization,…

Computation and Language · Computer Science 2020-01-22 Itsumi Saito, Kyosuke Nishida, Kosuke Nishida, Atsushi Otsuka, Hisako Asano, Junji Tomita, Hiroyuki Shindo, Yuji Matsumoto

HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization

To capture the semantic graph structure from raw text, most existing summarization approaches are built on GNNs with a pre-trained model. However, these methods suffer from cumbersome procedures and inefficient computations for long-text…

Computation and Language · Computer Science 2021-10-22 Ye Liu, Jian-Guo Zhang, Yao Wan, Congying Xia, Lifang He, Philip S. Yu

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization

End-to-end speech summarization (E2E SSum) directly summarizes input speech into easy-to-read short sentences with a single model. This approach is promising because it, in contrast to the conventional cascade approach, can utilize full…

Computation and Language · Computer Science 2023-06-08 Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Takatomo Kano, Atsunori Ogawa, Marc Delcroix

Should We Still Pretrain Encoders with Masked Language Modeling?

Learning high-quality text representations is fundamental to a wide range of NLP tasks. While encoder pretraining has traditionally relied on Masked Language Modeling (MLM), recent evidence suggests that decoder models pretrained with…

Computation and Language · Computer Science 2026-05-06 Hippolyte Gisserot-Boukhlef, Nicolas Boizard, Manuel Faysse, Duarte M. Alves, Emmanuel Malherbe, André F. T. Martins, Céline Hudelot, Pierre Colombo