Related papers: Parameter-efficient Multi-task Fine-tuning for Tra…

Parameter-Efficient Transfer Learning for NLP

Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we…

Machine Learning · Computer Science 2019-06-14 Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly

Parameter-Efficient Multi-Task Learning via Progressive Task-Specific Adaptation

Parameter-efficient fine-tuning methods have emerged as a promising solution for adapting pre-trained models to various downstream tasks. While these methods perform well in single-task learning, extending them to multi-task learning…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Neeraj Gangwar, Anshuka Rangi, Rishabh Deshmukh, Holakou Rahmanian, Yesh Dattatreya, Nickvash Kani

HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks

The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of pretrained models growing rapidly, how to perform parameter-efficient…

Computation and Language · Computer Science 2022-03-09 Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang

Jointly Reparametrized Multi-Layer Adaptation for Efficient and Private Tuning

Efficient finetuning of pretrained language transformers is becoming increasingly prevalent for solving natural language processing tasks. While effective, it can still require a large number of tunable parameters. This can be a drawback…

Computation and Language · Computer Science 2023-05-31 Umang Gupta, Aram Galstyan, Greg Ver Steeg

K for the Price of 1: Parameter-efficient Multi-task and Transfer Learning

We introduce a novel method that enables parameter-efficient transfer and multi-task learning with deep neural networks. The basic approach is to learn a model patch - a small set of parameters - that will specialize to each task, instead…

Machine Learning · Computer Science 2019-02-26 Pramod Kaushik Mudrakarta, Mark Sandler, Andrey Zhmoginov, Andrew Howard

Parameter Efficient Transfer Learning for Various Speech Processing Tasks

Fine-tuning of self-supervised models is a powerful transfer learning method in a variety of fields, including speech processing, since it can utilize generic feature representations obtained from large amounts of unlabeled data.…

Multimedia · Computer Science 2022-12-07 Shinta Otake, Rei Kawakami, Nakamasa Inoue

Parameter-Efficient Transfer Learning with Diff Pruning

While task-specific finetuning of pretrained networks has led to significant empirical advances in NLP, the large size of networks makes finetuning difficult to deploy in multi-task, memory-constrained settings. We propose diff pruning as a…

Computation and Language · Computer Science 2021-06-10 Demi Guo, Alexander M. Rush, Yoon Kim

Understanding Parameter Sharing in Transformers

Parameter sharing has proven to be a parameter-efficient approach. Previous work on Transformers has focused on sharing parameters in different layers, which can improve the performance of models with limited parameters by increasing model…

Machine Learning · Computer Science 2023-06-19 Ye Lin, Mingxuan Wang, Zhexi Zhang, Xiaohui Wang, Tong Xiao, Jingbo Zhu

Towards a Unified View of Parameter-Efficient Transfer Learning

Fine-tuning large pre-trained language models on downstream tasks has become the de-facto learning paradigm in NLP. However, conventional approaches fine-tune all the parameters of the pre-trained model, which becomes prohibitive as the…

Computation and Language · Computer Science 2022-02-03 Junxian He, Chunting Zhou, Xuezhe Ma, Taylor Berg-Kirkpatrick, Graham Neubig

An Empirical Study on the Transferability of Transformer Modules in Parameter-Efficient Fine-Tuning

Parameter-efficient fine-tuning approaches have recently garnered a lot of attention. Having considerably lower number of trainable weights, these methods can bring about scalability and computational effectiveness. In this paper, we look…

Computation and Language · Computer Science 2023-02-23 Mohammad Akbar-Tajari, Sara Rajaee, Mohammad Taher Pilehvar

Enhanced Transfer Learning with ImageNet Trained Classification Layer

Parameter fine tuning is a transfer learning approach whereby learned parameters from pre-trained source network are transferred to the target network followed by fine-tuning. Prior research has shown that this approach is capable of…

Computer Vision and Pattern Recognition · Computer Science 2019-09-20 Tasfia Shermin, Shyh Wei Teng, Manzur Murshed, Guojun Lu, Ferdous Sohel, Manoranjan Paul

HyperLoader: Integrating Hypernetwork-Based LoRA and Adapter Layers into Multi-Task Transformers for Sequence Labelling

We present HyperLoader, a simple approach that combines different parameter-efficient fine-tuning methods in a multi-task setting. To achieve this goal, our model uses a hypernetwork to generate the weights of these modules based on the…

Computation and Language · Computer Science 2024-08-27 Jesus-German Ortiz-Barajas, Helena Gomez-Adorno, Thamar Solorio

Optimizing Specific and Shared Parameters for Efficient Parameter Tuning

Foundation models, with a vast number of parameters and pretraining on massive datasets, achieve state-of-the-art performance across various applications. However, efficiently adapting them to downstream tasks with minimal computational…

Machine Learning · Computer Science 2025-04-07 Van-Anh Nguyen, Thanh-Toan Do, Mehrtash Harandi, Dinh Phung, Trung Le

Subformer: Exploring Weight Sharing for Parameter Efficiency in Generative Transformers

Transformers have shown improved performance when compared to previous architectures for sequence processing such as RNNs. Despite their sizeable performance gains, as recently suggested, the model is computationally expensive to train and…

Computation and Language · Computer Science 2021-09-09 Machel Reid, Edison Marrese-Taylor, Yutaka Matsuo

HyperPrompt: Prompt-based Task-Conditioning of Transformers

Prompt-Tuning is a new paradigm for finetuning pre-trained language models in a parameter-efficient way. Here, we explore the use of HyperNetworks to generate hyper-prompts: we propose HyperPrompt, a novel architecture for prompt-based…

Computation and Language · Computer Science 2022-06-16 Yun He, Huaixiu Steven Zheng, Yi Tay, Jai Gupta, Yu Du, Vamsi Aribandi, Zhe Zhao, YaGuang Li, Zhao Chen, Donald Metzler, Heng-Tze Cheng, Ed H. Chi

VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks

Recently, fine-tuning language models pre-trained on large text corpora have provided huge improvements on vision-and-language (V&L) tasks as well as on pure language tasks. However, fine-tuning the entire parameter set of pre-trained…

Computer Vision and Pattern Recognition · Computer Science 2022-03-25 Yi-Lin Sung, Jaemin Cho, Mohit Bansal

Parameter Sharing Methods for Multilingual Self-Attentional Translation Models

In multilingual neural machine translation, it has been shown that sharing a single translation model between multiple languages can achieve competitive performance, sometimes even leading to performance gains over bilingually trained…

Computation and Language · Computer Science 2018-09-14 Devendra Singh Sachan, Graham Neubig

Parameter-Efficient Fine-Tuning With Adapters

In the arena of language model fine-tuning, the traditional approaches, such as Domain-Adaptive Pretraining (DAPT) and Task-Adaptive Pretraining (TAPT), although effective, but computational intensive. This research introduces a novel…

Computation and Language · Computer Science 2024-05-10 Keyu Chen, Yuan Pang, Zi Yang

Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning

Fine-tuning pre-trained generative language models to down-stream language generation tasks has shown promising results. However, this comes with the cost of having a single, large model for each task, which is not ideal in low-memory/power…

Computation and Language · Computer Science 2020-09-22 Zhaojiang Lin, Andrea Madotto, Pascale Fung

Multilingual Machine Translation with Hyper-Adapters

Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to…

Computation and Language · Computer Science 2022-12-06 Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale