Related papers: Adapters Strike Back

Adaptable Adapters

State-of-the-art pretrained NLP models contain a hundred million to trillion parameters. Adapters provide a parameter-efficient alternative for the full finetuning in which we can only finetune lightweight neural network layers on top of…

Computation and Language · Computer Science · 2022-05-04 · Nafise Sadat Moosavi, Quentin Delfosse, Kristian Kersting, Iryna Gurevych

AdapterBias: Parameter-efficient Token-dependent Representation Shift for Adapters in NLP Tasks

Transformer-based pre-trained models with millions of parameters require large storage. Recent approaches tackle this shortcoming by training adapters, but these approaches still require a relatively large number of parameters. In this…

Computation and Language · Computer Science · 2023-01-31 · Chin-Lun Fu, Zih-Ching Chen, Yun-Ru Lee, Hung-yi Lee

AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition

Pretraining Vision Transformers (ViTs) has achieved great success in visual recognition. A following scenario is to adapt a ViT to various image and video recognition tasks. The adaptation is challenging because of heavy computation and…

Computer Vision and Pattern Recognition · Computer Science · 2022-10-18 · Shoufa Chen, Chongjian Ge, Zhan Tong, Jiangliu Wang, Yibing Song, Jue Wang, Ping Luo

Looped Transformers are Better at Learning Learning Algorithms

Transformers have demonstrated effectiveness in in-context solving data-fitting problems from various (latent) models, as reported by Garg et al. However, the absence of an inherent iterative structure in the transformer architecture…

Machine Learning · Computer Science · 2024-03-19 · Liu Yang, Kangwook Lee, Robert Nowak, Dimitris Papailiopoulos

AdapterDrop: On the Efficiency of Adapters in Transformers

Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements. Recent approaches tackle these shortcomings by training smaller models, dynamically reducing the…

Machine Learning · Computer Science · 2021-10-07 · Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych

Parameter Efficient Transfer Learning for Various Speech Processing Tasks

Fine-tuning of self-supervised models is a powerful transfer learning method in a variety of fields, including speech processing, since it can utilize generic feature representations obtained from large amounts of unlabeled data.…

Multimedia · Computer Science · 2022-12-07 · Shinta Otake, Rei Kawakami, Nakamasa Inoue

A Comprehensive Analysis of Adapter Efficiency

Adapters have been positioned as a parameter-efficient fine-tuning (PEFT) approach, whereby a minimal number of parameters are added to the model and fine-tuned. However, adapters have not been sufficiently analyzed to understand if PEFT…

Computation and Language · Computer Science · 2023-05-15 · Nandini Mundra, Sumanth Doddapaneni, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra

Hierarchical Recurrent Adapters for Efficient Multi-Task Adaptation of Large Speech Models

Parameter efficient adaptation methods have become a key mechanism to train large pre-trained models for downstream tasks. However, their per-task parameter overhead is considered still high when the number of downstream tasks to adapt for…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-04-01 · Tsendsuren Munkhdalai, Youzheng Chen, Khe Chai Sim, Fadi Biadsy, Tara Sainath, Pedro Moreno Mengibar

Time-, Memory- and Parameter-Efficient Visual Adaptation

As foundation models become more popular, there is a growing need to efficiently finetune them for downstream tasks. Although numerous adaptation methods have been proposed, they are designed to be efficient only in terms of how many…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-06 · Otniel-Bogdan Mercea, Alexey Gritsenko, Cordelia Schmid, Anurag Arnab

Towards Optimal Adapter Placement for Efficient Transfer Learning

Parameter-efficient transfer learning (PETL) aims to adapt pre-trained models to new downstream tasks while minimizing the number of fine-tuned parameters. Adapters, a popular approach in PETL, inject additional capacity into existing…

Machine Learning · Computer Science · 2024-10-22 · Aleksandra I. Nowak, Otniel-Bogdan Mercea, Anurag Arnab, Jonas Pfeiffer, Yann Dauphin, Utku Evci

Robust Transfer Learning with Pretrained Language Models through Adapters

Transfer learning with large pretrained transformer-based language models like BERT has become a dominating approach for most NLP tasks. Simply fine-tuning those large language models on downstream tasks or combining it with task-specific…

Computation and Language · Computer Science · 2021-08-06 · Wenjuan Han, Bo Pang, Yingnian Wu

Adaptivity and Modularity for Efficient Generalization Over Task Complexity

Can transformers generalize efficiently on problems that require dealing with examples with different levels of difficulty? We introduce a new task tailored to assess generalization over different complexities and present results that…

Machine Learning · Computer Science · 2023-10-16 · Samira Abnar, Omid Saremi, Laurent Dinh, Shantel Wilson, Miguel Angel Bautista, Chen Huang, Vimal Thilak, Etai Littwin, Jiatao Gu, Josh Susskind, Samy Bengio

Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning

We introduce Adapters, an open-source library that unifies parameter-efficient and modular transfer learning in large language models. By integrating 10 diverse adapter methods into a unified interface, Adapters offers ease of use and…

Computation and Language · Computer Science · 2023-11-21 · Clifton Poth, Hannah Sterz, Indraneil Paul, Sukannya Purkayastha, Leon Engländer, Timo Imhof, Ivan Vulić, Sebastian Ruder, Iryna Gurevych, Jonas Pfeiffer

Multilingual Machine Translation with Hyper-Adapters

Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to…

Computation and Language · Computer Science · 2022-12-06 · Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale

Mini but Mighty: Finetuning ViTs with Mini Adapters

Vision Transformers (ViTs) have become one of the dominant architectures in computer vision, and pre-trained ViT models are commonly adapted to new tasks via fine-tuning. Recent works proposed several parameter-efficient transfer learning…

Computer Vision and Pattern Recognition · Computer Science · 2023-11-08 · Imad Eddine Marouf, Enzo Tartaglione, Stéphane Lathuilière

Comparative Analysis of Efficient Adapter-Based Fine-Tuning of State-of-the-Art Transformer Models

In this work, we investigate the efficacy of various adapter architectures on supervised binary classification tasks from the SuperGLUE benchmark as well as a supervised multi-class news category classification task from Kaggle.…

Computation and Language · Computer Science · 2025-01-15 · Saad Mashkoor Siddiqui, Mohammad Ali Sheikh, Muhammad Aleem, Kajol R Singh

On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a…

Computation and Language · Computer Science · 2021-06-08 · Ruidan He, Linlin Liu, Hai Ye, Qingyu Tan, Bosheng Ding, Liying Cheng, Jia-Wei Low, Lidong Bing, Luo Si

Structure-Learnable Adapter Fine-Tuning for Parameter-Efficient Large Language Models

This paper addresses the issues of parameter redundancy, rigid structure, and limited task adaptability in the fine-tuning of large language models. It proposes an adapter-based fine-tuning method built on a structure-learnable mechanism.…

Computation and Language · Computer Science · 2025-09-04 · Ming Gong, Yingnan Deng, Nia Qi, Yujun Zou, Zhihao Xue, Yun Zi

Lightweight Adapter Tuning for Multilingual Speech Translation

Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP. Adapter tuning consists in freezing pretrained parameters of a model and injecting lightweight modules between layers, resulting in the addition of…

Computation and Language · Computer Science · 2021-07-14 · Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier

Tiny-Attention Adapter: Contexts Are More Important Than the Number of Parameters

Adapter-tuning is a paradigm that transfers a pretrained language model to downstream tasks by adding and tuning a small number of new parameters. Previously proposed adapter architectures are all feed-forward neural networks. In this…

Computation and Language · Computer Science · 2022-11-04 · Hongyu Zhao, Hao Tan, Hongyuan Mei