Related papers: Compacter: Efficient Low-Rank Hypercomplex Adapter…

Jointly Reparametrized Multi-Layer Adaptation for Efficient and Private Tuning

Efficient finetuning of pretrained language transformers is becoming increasingly prevalent for solving natural language processing tasks. While effective, it can still require a large number of tunable parameters. This can be a drawback…

Computation and Language · Computer Science 2023-05-31 Umang Gupta, Aram Galstyan, Greg Ver Steeg

KronA: Parameter Efficient Tuning with Kronecker Adapter

Fine-tuning a Pre-trained Language Model (PLM) on a specific downstream task has been a well-known paradigm in Natural Language Processing. However, with the ever-growing size of PLMs, training the entire model on several downstream tasks…

Computation and Language · Computer Science 2022-12-22 Ali Edalati, Marzieh Tahaei, Ivan Kobyzev, Vahid Partovi Nia, James J. Clark, Mehdi Rezagholizadeh

Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations

Due to the huge amount of parameters, fine-tuning of pretrained language models (PLMs) is prone to overfitting in the low resource scenarios. In this work, we present a novel method that operates on the hidden representations of a PLM to…

Computation and Language · Computer Science 2023-05-29 Linlin Liu, Xingxuan Li, Megh Thakkar, Xin Li, Shafiq Joty, Luo Si, Lidong Bing

Robust Transfer Learning with Pretrained Language Models through Adapters

Transfer learning with large pretrained transformer-based language models like BERT has become a dominating approach for most NLP tasks. Simply fine-tuning those large language models on downstream tasks or combining it with task-specific…

Computation and Language · Computer Science 2021-08-06 Wenjuan Han, Bo Pang, Yingnian Wu

Polyhistor: Parameter-Efficient Multi-Task Adaptation for Dense Vision Tasks

Adapting large-scale pretrained models to various downstream tasks via fine-tuning is a standard method in machine learning. Recently, parameter-efficient fine-tuning methods show promise in adapting a pretrained model to different tasks…

Computer Vision and Pattern Recognition · Computer Science 2022-10-10 Yen-Cheng Liu, Chih-Yao Ma, Junjiao Tian, Zijian He, Zsolt Kira

Parameter-Efficient Transfer Learning for NLP

Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we…

Machine Learning · Computer Science 2019-06-14 Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly

Consolidator: Mergeable Adapter with Grouped Connections for Visual Adaptation

Recently, transformers have shown strong ability as visual feature extractors, surpassing traditional convolution-based models in various scenarios. However, the success of vision transformers largely owes to their capacity to accommodate…

Computer Vision and Pattern Recognition · Computer Science 2023-05-02 Tianxiang Hao, Hui Chen, Yuchen Guo, Guiguang Ding

Optimising Language Models for Downstream Tasks: A Post-Training Perspective

Language models (LMs) have demonstrated remarkable capabilities in NLP, yet adapting them efficiently and robustly to specific tasks remains challenging. As their scale and complexity grow, fine-tuning LMs on labelled data often…

Computation and Language · Computer Science 2025-06-27 Zhengyan Shi

On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a…

Computation and Language · Computer Science 2021-06-08 Ruidan He, Linlin Liu, Hai Ye, Qingyu Tan, Bosheng Ding, Liying Cheng, Jia-Wei Low, Lidong Bing, Luo Si

TransformerRanker: A Tool for Efficiently Finding the Best-Suited Language Models for Downstream Classification Tasks

Classification tasks in NLP are typically addressed by selecting a pre-trained language model (PLM) from a model hub, and fine-tuning it for the task at hand. However, given the very large number of PLMs that are currently available, a…

Computation and Language · Computer Science 2024-09-11 Lukas Garbas, Max Ploner, Alan Akbik

AdapterHub: A Framework for Adapting Transformers

The current modus operandi in NLP involves downloading and fine-tuning pre-trained models consisting of millions or billions of parameters. Storing and sharing such large trained models is expensive, slow, and time-consuming, which impedes…

Computation and Language · Computer Science 2020-10-07 Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych

HyperLoader: Integrating Hypernetwork-Based LoRA and Adapter Layers into Multi-Task Transformers for Sequence Labelling

We present HyperLoader, a simple approach that combines different parameter-efficient fine-tuning methods in a multi-task setting. To achieve this goal, our model uses a hypernetwork to generate the weights of these modules based on the…

Computation and Language · Computer Science 2024-08-27 Jesus-German Ortiz-Barajas, Helena Gomez-Adorno, Thamar Solorio

Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size

Fine-tuning a pretrained transformer for a downstream task has become a standard method in NLP in the last few years. While the results from these models are impressive, applying them can be extremely computationally expensive, as is…

Computation and Language · Computer Science 2020-08-18 Davis Yoshida, Allyson Ettinger, Kevin Gimpel

HyperAdapt: Simple High-Rank Adaptation

Foundation models excel across diverse tasks, but adapting them to specialized applications often requires fine-tuning, an approach that is memory and compute-intensive. Parameter-efficient fine-tuning (PEFT) methods mitigate this by…

Machine Learning · Computer Science 2026-04-24 Abel Gurung, Joseph Campbell

FineGates: LLMs Finetuning with Compression using Stochastic Gates

Large Language Models (LLMs), with billions of parameters, present significant challenges for full finetuning due to the high computational demands, memory requirements, and impracticality of many real-world applications. When faced with…

Machine Learning · Computer Science 2024-12-18 Jonathan Svirsky, Yehonathan Refael, Ofir Lindenbaum

The LLM Surgeon

State-of-the-art language models are becoming increasingly large in an effort to achieve the highest performance on large corpora of available textual data. However, the sheer size of the Transformer architectures makes it difficult to…

Machine Learning · Computer Science 2024-03-22 Tycho F. A. van der Ouderaa, Markus Nagel, Mart van Baalen, Yuki M. Asano, Tijmen Blankevoort

Low-Rank Adapters Meet Neural Architecture Search for LLM Compression

The rapid expansion of Large Language Models (LLMs) has posed significant challenges regarding the computational resources required for fine-tuning and deployment. Recent advancements in low-rank adapters have demonstrated their efficacy in…

Machine Learning · Computer Science 2025-01-29 J. Pablo Muñoz, Jinjie Yuan, Nilesh Jain

Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models

Recent transformer language models achieve outstanding results in many natural language processing (NLP) tasks. However, their enormous size often makes them impractical on memory-constrained devices, requiring practitioners to compress…

Computation and Language · Computer Science 2023-02-09 Mohammadreza Banaei, Klaudia Bałazy, Artur Kasymov, Rémi Lebret, Jacek Tabor, Karl Aberer

Kron-LoRA: Hybrid Kronecker-LoRA Adapters for Scalable, Sustainable Fine-tuning

Fine-tuning massive pre-trained language models across many tasks demands adapters that are both parameter-efficient and expressive. We introduce Kron-LoRA, a hybrid adapter that combines Kronecker-structured factorization with…

Machine Learning · Computer Science 2025-09-25 Yixin Shen

MoKA: Mixture of Kronecker Adapters

Parameter-efficient fine-tuning (PEFT) is essential for reducing the computational overhead of large language models (LLMs). Low-rank family adapters are commonly used to control the parameter size efficiently while maintaining the…

Machine Learning · Computer Science 2025-08-06 Mohammadreza Sadeghi, Mahsa Ghazvini Nejad, MirHamed Jafarzadeh Asl, Yu Gu, Yuanhao Yu, Masoud Asgharian, Vahid Partovi Nia