Related papers: Does Pretraining for Summarization Require Knowled…

Enhancing Biomedical Text Summarization and Question-Answering: On the Utility of Domain-Specific Pre-Training

Biomedical summarization requires large datasets to train for text generation. We show that while transfer learning offers a viable option for addressing this challenge, an in-domain pre-training does not always offer advantages in a BioASQ…

Computation and Language · Computer Science 2023-07-11 Dima Galat, Marian-Andrei Rizoiu

Synthetic Pre-Training Tasks for Neural Machine Translation

Pre-training models with large crawled corpora can lead to issues such as toxicity and bias, as well as copyright and privacy concerns. A promising way of alleviating such concerns is to conduct pre-training with synthetic tasks and data,…

Computation and Language · Computer Science 2023-06-01 Zexue He, Graeme Blackwood, Rameswar Panda, Julian McAuley, Rogerio Feris

Deep Transfer Reinforcement Learning for Text Summarization

Deep neural networks are data-hungry models and thus face difficulties when attempting to train on small text datasets. Transfer learning is a potential solution but its effectiveness in the text domain is not as explored as in areas such…

Machine Learning · Computer Science 2019-01-28 Yaser Keneshloo, Naren Ramakrishnan, Chandan K. Reddy

Meta-Transfer Learning for Low-Resource Abstractive Summarization

Neural abstractive summarization has been studied in many pieces of literature and achieves great success with the aid of large corpora. However, when encountering novel tasks, one may not always benefit from transfer learning due to the…

Computation and Language · Computer Science 2021-06-01 Yi-Syuan Chen, Hong-Han Shuai

Training Dynamics for Text Summarization Models

Pre-trained language models (e.g. BART) have shown impressive results when fine-tuned on large summarization datasets. However, little is understood about this fine-tuning process, including what knowledge is retained from pre-training time…

Computation and Language · Computer Science 2022-03-16 Tanya Goyal, Jiacheng Xu, Junyi Jessy Li, Greg Durrett

Compositional generalization in semantic parsing with pretrained transformers

Large-scale pretraining instills large amounts of knowledge in deep neural networks. This, in turn, improves the generalization behavior of these models in downstream tasks. What exactly are the limits to the generalization benefits of…

Computation and Language · Computer Science 2022-12-23 A. Emin Orhan

Knowledge Transfer via Pre-training for Recommendation: A Review and Prospect

Recommender systems aim to provide item recommendations for users, and are usually faced with the data sparsity problem (e.g., cold start) in real-world scenarios. Recently, pre-trained models have shown their effectiveness in knowledge transfer…

Information Retrieval · Computer Science 2020-09-22 Zheni Zeng, Chaojun Xiao, Yuan Yao, Ruobing Xie, Zhiyuan Liu, Fen Lin, Leyu Lin, Maosong Sun

Why pre-training is beneficial for downstream classification tasks?

Pre-training has exhibited notable benefits to downstream tasks by boosting accuracy and speeding up convergence, but the exact reasons for these benefits still remain unclear. To this end, we propose to quantitatively and explicitly…

Machine Learning · Computer Science 2024-10-14 Xin Jiang, Xu Cheng, Zechao Li

Downstream Datasets Make Surprisingly Good Pretraining Corpora

For most natural language processing tasks, the dominant practice is to finetune large pretrained transformer models (e.g., BERT) using smaller downstream datasets. Despite the success of this approach, it remains unclear to what extent…

Computation and Language · Computer Science 2023-05-29 Kundan Krishna, Saurabh Garg, Jeffrey P. Bigham, Zachary C. Lipton

Domain Pre-training Impact on Representations

This empirical study analyzes the effects of the pre-training corpus on the quality of learned transformer representations. We focus on the representation quality induced solely through pre-training. Our experiments show that pre-training…

Computation and Language · Computer Science 2025-06-02 Cesar Gonzalez-Gutierrez, Ariadna Quattoni

PreSumm: Predicting Summarization Performance Without Summarizing

Despite recent advancements in automatic summarization, state-of-the-art models do not summarize all documents equally well, raising the question: why? While prior research has extensively analyzed summarization models, little attention has…

Computation and Language · Computer Science 2025-04-09 Steven Koniaev, Ori Ernst, Jackie Chi Kit Cheung

Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition

Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models, and generally yields improved performance and faster training times. The technique of pre-training on one task and then…

Machine Learning · Computer Science 2020-01-03 Nishai Kooverjee, Steven James, Terence van Zyl

Beyond Repetition: Text Simplification and Curriculum Learning for Data-Constrained Pretraining

Most studies on language model pretraining focus on large datasets, leaving open questions about optimization in data-constrained settings. In such settings, the effects of training data order and of including alternative versions of the…

Computation and Language · Computer Science 2025-09-30 Matthew Theodore Roque, Dan John Velasco

Distraction-Based Neural Networks for Document Summarization

Distributed representation learned with neural networks has recently been shown to be effective in modeling natural languages at fine granularities such as words, phrases, and even sentences. Whether and how such an approach can be extended to…

Computation and Language · Computer Science 2016-10-27 Qian Chen, Xiaodan Zhu, Zhenhua Ling, Si Wei, Hui Jiang

TED: A Pretrained Unsupervised Summarization Model with Theme Modeling and Denoising

Text summarization aims to extract essential information from a piece of text and transform the text into a concise version. Existing unsupervised abstractive summarization models leverage recurrent neural networks framework while the…

Computation and Language · Computer Science 2020-10-20 Ziyi Yang, Chenguang Zhu, Robert Gmyr, Michael Zeng, Xuedong Huang, Eric Darve

Transformers Pretrained on Procedural Data Contain Modular Structures for Algorithmic Reasoning

Pretraining on large, semantically rich datasets is key for developing language models. Surprisingly, recent studies have shown that even synthetic data, generated procedurally through simple semantic-free algorithms, can yield some of the…

Machine Learning · Computer Science 2025-05-29 Zachary Shinnick, Liangze Jiang, Hemanth Saratchandran, Anton van den Hengel, Damien Teney

KLearn: Background Knowledge Inference from Summarization Data

The goal of text summarization is to compress documents to the relevant information while excluding background information already known to the receiver. So far, summarization researchers have given considerably more attention to relevance…

Computation and Language · Computer Science 2020-10-14 Maxime Peyrard, Robert West

Informed Pre-Training on Prior Knowledge

When training data is scarce, the incorporation of additional prior knowledge can assist the learning process. While it is common to initialize neural networks with weights that have been pre-trained on other large data sets, pre-training…

Machine Learning · Computer Science 2022-05-24 Laura von Rueden, Sebastian Houben, Kostadin Cvejoski, Christian Bauckhage, Nico Piatkowski

Leveraging Visual Knowledge in Language Tasks: An Empirical Study on Intermediate Pre-training for Cross-modal Knowledge Transfer

Pre-trained language models are still far from human performance in tasks that need understanding of properties (e.g. appearance, measurable quantity) and affordances of everyday objects in the real world since the text lacks such…

Computation and Language · Computer Science 2022-03-18 Woojeong Jin, Dong-Ho Lee, Chenguang Zhu, Jay Pujara, Xiang Ren

Large Language Models and Book Summarization: Reading or Remembering, Which Is Better?

Summarization is a core task in Natural Language Processing (NLP). Recent advances in Large Language Models (LLMs) and the introduction of large context windows reaching millions of tokens make it possible to process entire books in a…

Computation and Language · Computer Science 2026-03-12 Tairan Fu, Javier Conde, Pedro Reviriego, Javier Coronado-Blázquez, Nina Melero, Elena Merino-Gómez