Related papers: What to Pre-Train on? Efficient Intermediate Task …

Efficiently Tuned Parameters are Task Embeddings

Intermediate-task transfer can benefit a wide range of NLP tasks with properly selected source datasets. However, it is computationally infeasible to experiment with all intermediate transfer combinations, making choosing a useful source…

Computation and Language · Computer Science 2022-10-24 Wangchunshu Zhou , Canwen Xu , Julian McAuley

Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning

Intermediate task transfer learning can greatly improve model performance. If, for example, one has little training data for emotion detection, first fine-tuning a language model on a sentiment classification dataset may improve performance…

Computation and Language · Computer Science 2024-10-22 David Schulte , Felix Hamborg , Alan Akbik

Does QA-based intermediate training help fine-tuning language models for text classification?

Fine-tuning pre-trained language models for downstream tasks has become a norm for NLP. Recently it is found that intermediate training based on high-level inference tasks such as Question Answering (QA) can improve the performance of some…

Computation and Language · Computer Science 2022-01-03 Shiwei Zhang , Xiuzhen Zhang

The (In)Effectiveness of Intermediate Task Training For Domain Adaptation and Cross-Lingual Transfer Learning

Transfer learning from large language models (LLMs) has emerged as a powerful technique to enable knowledge-based fine-tuning for a number of tasks, adaptation of models for different domains and even languages. However, it remains an open…

Computation and Language · Computer Science 2022-11-08 Sovesh Mohapatra , Somesh Mohapatra

Exploring the Effectiveness and Consistency of Task Selection in Intermediate-Task Transfer Learning

Identifying beneficial tasks to transfer from is a critical step toward successful intermediate-task transfer learning. In this work, we experiment with 130 source-target task combinations and demonstrate that the transfer performance…

Computation and Language · Computer Science 2024-07-24 Pin-Jie Lin , Miaoran Zhang , Marius Mosbach , Dietrich Klakow

Intermediate-Task Transfer Learning with Pretrained Models for Natural Language Understanding: When and Why Does It Work?

While pretrained models such as BERT have shown large gains across natural language understanding tasks, their performance can be improved by further training the model on a data-rich intermediate task, before fine-tuning it on a target…

Computation and Language · Computer Science 2020-05-12 Yada Pruksachatkun , Jason Phang , Haokun Liu , Phu Mon Htut , Xiaoyi Zhang , Richard Yuanzhe Pang , Clara Vania , Katharina Kann , Samuel R. Bowman

Exploring and Predicting Transferability across NLP Tasks

Recent advances in NLP demonstrate the effectiveness of training large-scale language models and transferring them to downstream tasks. Can fine-tuning these models on tasks other than language modeling further improve performance? In this…

Computation and Language · Computer Science 2020-10-08 Tu Vu , Tong Wang , Tsendsuren Munkhdalai , Alessandro Sordoni , Adam Trischler , Andrew Mattarella-Micke , Subhransu Maji , Mohit Iyyer

FineText: Text Classification via Attention-based Language Model Fine-tuning

Training deep neural networks from scratch on natural language processing (NLP) tasks requires significant amount of manually labeled text corpus and substantial time to converge, which usually cannot be satisfied by the customers. In this…

Computation and Language · Computer Science 2019-10-29 Yunzhe Tao , Saurabh Gupta , Satyapriya Krishna , Xiong Zhou , Orchid Majumder , Vineet Khare

Efficient Neural Task Adaptation by Maximum Entropy Initialization

Transferring knowledge from one neural network to another has been shown to be helpful for learning tasks with few training examples. Prevailing fine-tuning methods could potentially contaminate pre-trained features by comparably high…

Machine Learning · Computer Science 2019-07-15 Farshid Varno , Behrouz Haji Soleimani , Marzie Saghayi , Lisa Di Jorio , Stan Matwin

Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5

We compare sequential fine-tuning with a model for multi-task learning in the context where we are interested in boosting performance on two tasks, one of which depends on the other. We test these models on the FigLang2022 shared task which…

Computation and Language · Computer Science 2022-11-01 Irina Bigoulaeva , Rachneet Sachdeva , Harish Tayyar Madabushi , Aline Villavicencio , Iryna Gurevych

Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning

Prompt tuning, in which a base pretrained model is adapted to each task via conditioning on learned prompt vectors, has emerged as a promising approach for efficiently adapting large language models to multiple downstream tasks. However,…

Computation and Language · Computer Science 2023-03-07 Zhen Wang , Rameswar Panda , Leonid Karlinsky , Rogerio Feris , Huan Sun , Yoon Kim

English Intermediate-Task Training Improves Zero-Shot Cross-Lingual Transfer Too

Intermediate-task training---fine-tuning a pretrained model on an intermediate task before fine-tuning again on the target task---often improves model performance substantially on language understanding tasks in monolingual English…

Computation and Language · Computer Science 2020-10-02 Jason Phang , Iacer Calixto , Phu Mon Htut , Yada Pruksachatkun , Haokun Liu , Clara Vania , Katharina Kann , Samuel R. Bowman

Parameter-Efficient Transfer Learning for NLP

Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we…

Machine Learning · Computer Science 2019-06-14 Neil Houlsby , Andrei Giurgiu , Stanislaw Jastrzebski , Bruna Morrone , Quentin de Laroussilhe , Andrea Gesmundo , Mona Attariyan , Sylvain Gelly

Supervised Contextual Embeddings for Transfer Learning in Natural Language Processing Tasks

Pre-trained word embeddings are the primary method for transfer learning in several Natural Language Processing (NLP) tasks. Recent works have focused on using unsupervised techniques such as language modeling to obtain these embeddings. In…

Computation and Language · Computer Science 2019-07-01 Mihir Kale , Aditya Siddhant , Sreyashi Nag , Radhika Parik , Matthias Grabmair , Anthony Tomasic

Muppet: Massive Multi-task Representations with Pre-Finetuning

We propose pre-finetuning, an additional large-scale learning stage between language model pre-training and fine-tuning. Pre-finetuning is massively multi-task learning (around 50 datasets, over 4.8 million total labeled examples), and is…

Computation and Language · Computer Science 2021-01-28 Armen Aghajanyan , Anchit Gupta , Akshat Shrivastava , Xilun Chen , Luke Zettlemoyer , Sonal Gupta

Efficient Conditional Pre-training for Transfer Learning

Almost all the state-of-the-art neural networks for computer vision tasks are trained by (1) pre-training on a large-scale dataset and (2) finetuning on the target dataset. This strategy helps reduce dependence on the target dataset and…

Computer Vision and Pattern Recognition · Computer Science 2021-11-22 Shuvam Chakraborty , Burak Uzkent , Kumar Ayush , Kumar Tanmay , Evan Sheehan , Stefano Ermon

Scalable Fine-tuning from Multiple Data Sources: A First-Order Approximation Approach

We study the problem of fine-tuning a language model (LM) for a target task by optimally using the information from $n$ auxiliary tasks. This problem has broad applications in NLP, such as targeted instruction tuning and data selection in…

Computation and Language · Computer Science 2025-06-03 Dongyue Li , Ziniu Zhang , Lu Wang , Hongyang R. Zhang

When to Use Multi-Task Learning vs Intermediate Fine-Tuning for Pre-Trained Encoder Transfer Learning

Transfer learning (TL) in natural language processing (NLP) has seen a surge of interest in recent years, as pre-trained models have shown an impressive ability to transfer to novel tasks. Three main strategies have emerged for making use…

Computation and Language · Computer Science 2022-05-18 Orion Weller , Kevin Seppi , Matt Gardner

Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner

Language models (LMs) trained on vast quantities of unlabelled data have greatly advanced the field of natural language processing (NLP). In this study, we re-visit the widely accepted notion in NLP that continued pre-training LMs on…

Computation and Language · Computer Science 2023-10-09 Zhengxiang Shi , Aldo Lipani

ProtoDA: Efficient Transfer Learning for Few-Shot Intent Classification

Practical sequence classification tasks in natural language processing often suffer from low training data availability for target classes. Recent works towards mitigating this problem have focused on transfer learning using embeddings…

Computation and Language · Computer Science 2021-01-29 Manoj Kumar , Varun Kumar , Hadrien Glaude , Cyprien delichy , Aman Alok , Rahul Gupta