Related papers: HyperLoader: Integrating Hypernetwork-Based LoRA a…

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks

State-of-the-art parameter-efficient fine-tuning methods rely on introducing adapter modules between the layers of a pretrained language model. However, such modules are trained separately for each task and thus do not enable sharing…

Computation and Language · Computer Science 2021-06-09 Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson

Hyperdecoders: Instance-specific decoders for multi-task NLP

We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder. This approach produces a unique decoder…

Computation and Language · Computer Science 2022-10-19 Hamish Ivison, Matthew E. Peters

HypeLoRA: Hyper-Network-Generated LoRA Adapters for Calibrated Language Model Fine-Tuning

Modern Transformer-based models frequently suffer from miscalibration, producing overconfident predictions that do not reflect true empirical frequencies. This work investigates the calibration dynamics of LoRA: Low-Rank Adaptation and a…

Computation and Language · Computer Science 2026-03-31 Bartosz Trojan, Filip Gębala

HyperTuning: Toward Adapting Large Language Models without Back-propagation

Fine-tuning large language models for different tasks can be costly and inefficient, and even methods that reduce the number of tuned parameters still require full gradient-based optimization. We propose HyperTuning, a novel approach to…

Computation and Language · Computer Science 2022-11-23 Jason Phang, Yi Mao, Pengcheng He, Weizhu Chen

HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning

In this work we propose a HyperTransformer, a Transformer-based model for supervised and semi-supervised few-shot learning that generates weights of a convolutional neural network (CNN) directly from support samples. Since the dependence of…

Machine Learning · Computer Science 2022-07-15 Andrey Zhmoginov, Mark Sandler, Max Vladymyrov

HyperAdapt: Simple High-Rank Adaptation

Foundation models excel across diverse tasks, but adapting them to specialized applications often requires fine-tuning, an approach that is memory and compute-intensive. Parameter-efficient fine-tuning (PEFT) methods mitigate this by…

Machine Learning · Computer Science 2026-04-24 Abel Gurung, Joseph Campbell

MetaLoRA: Tensor-Enhanced Adaptive Low-Rank Fine-tuning

There has been a significant increase in the deployment of neural network models, presenting substantial challenges in model adaptation and fine-tuning. Efficient adaptation is crucial in maintaining model performance across diverse tasks…

Machine Learning · Computer Science 2025-04-02 Maolin Wang, Xiangyu Zhao

HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks

The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of pretrained models growing rapidly, how to perform parameter-efficient…

Computation and Language · Computer Science 2022-03-09 Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang

Multi-Task Pre-Finetuning of Lightweight Transformer Encoders for Text Classification and NER

Deploying natural language processing (NLP) models on mobile platforms requires models that can adapt across diverse applications while remaining efficient in memory and computation. We investigate pre-finetuning strategies to enhance the…

Computation and Language · Computer Science 2025-10-10 Junyi Zhu, Savas Ozkan, Andrea Maracani, Sinan Mutlu, Cho Jung Min, Mete Ozay

NetTailor: Tuning the Architecture, Not Just the Weights

Real-world applications of object recognition often require the solution of multiple tasks in a single platform. Under the standard paradigm of network fine-tuning, an entirely new CNN is learned per task, and the final network size is…

Computer Vision and Pattern Recognition · Computer Science 2019-07-02 Pedro Morgado, Nuno Vasconcelos

Hypernetwork-Driven Low-Rank Adaptation Across Attention Heads

Parameter-efficient fine-tuning (PEFT) has emerged as a powerful paradigm for adapting large-scale pre-trained models to downstream tasks with minimal additional parameters. Among PEFT methods, Low-Rank Adaptation (LoRA) stands out for its…

Machine Learning · Computer Science 2026-02-03 Nghiem T. Diep, Dung Le, Tuan Truong, Tan Dinh, Huy Nguyen, Nhat Ho

HyperGrid: Efficient Multi-Task Transformers with Grid-wise Decomposable Hyper Projections

Achieving state-of-the-art performance on natural language understanding tasks typically relies on fine-tuning a fresh model for every task. Consequently, this approach leads to a higher overall parameter cost, along with higher technical…

Computation and Language · Computer Science 2020-07-14 Yi Tay, Zhe Zhao, Dara Bahri, Donald Metzler, Da-Cheng Juan

Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data

Multi-Task Learning (MTL) networks have emerged as a promising method for transferring learned knowledge across different tasks. However, MTL must deal with challenges such as: overfitting to low resource tasks, catastrophic forgetting, and…

Machine Learning · Computer Science 2022-04-22 Jonathan Pilault, Amine Elhattami, Christopher Pal

Compacter: Efficient Low-Rank Hypercomplex Adapter Layers

Adapting large-scale pretrained language models to downstream tasks via fine-tuning is the standard method for achieving state-of-the-art performance on NLP benchmarks. However, fine-tuning all weights of models with millions or billions of…

Computation and Language · Computer Science 2021-11-30 Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder

MeTA-LoRA: Data-Efficient Multi-Task Fine-Tuning for Large Language Models

Low-Rank Adaptation (LoRA) has emerged as one of the most widely used parameter-efficient fine-tuning (PEFT) methods for adapting large language models (LLMs) to downstream tasks. While highly effective in single-task settings, it struggles…

Computation and Language · Computer Science 2025-10-14 Bo Cheng, Xu Wang, Jinda Liu, Yi Chang, Yuan Wu

WeightLoRA: Keep Only Necessary Adapters

The widespread utilization of language models in modern applications is inconceivable without Parameter-Efficient Fine-Tuning techniques, such as low-rank adaptation (LoRA), which adds trainable adapters to selected layers.…

Machine Learning · Computer Science 2025-10-17 Andrey Veprikov, Vladimir Solodkin, Alexander Zyl, Andrey Savchenko, Aleksandr Beznosikov

Towards Symmetric Low-Rank Adapters

In this paper, we introduce Symmetric Low-Rank Adapters, an optimized variant of LoRA with even fewer weights. This method utilizes Low-Rank Symmetric Weight Matrices to learn downstream tasks more efficiently. Traditional LoRA accumulates…

Machine Learning · Computer Science 2025-04-17 Tales Panoutsos, Rodrygo L. T. Santos, Flavio Figueiredo

Multi-Sensor Matching with HyperNetworks

Hypernetworks are models that generate or modulate the weights of another network. They provide a flexible mechanism for injecting context and task conditioning and have proven broadly useful across diverse applications without significant…

Computer Vision and Pattern Recognition · Computer Science 2026-01-21 Eli Passov, Nathan S. Netanyahu, Yosi Keller

A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks

Recent work on mode connectivity in the loss landscape of deep neural networks has demonstrated that the locus of (sub-)optimal weight vectors lies on continuous paths. In this work, we train a neural network that serves as a hypernetwork,…

Machine Learning · Statistics 2019-05-09 Lior Deutsch, Erik Nijkamp, Yu Yang

HyperPrompt: Prompt-based Task-Conditioning of Transformers

Prompt-Tuning is a new paradigm for finetuning pre-trained language models in a parameter-efficient way. Here, we explore the use of HyperNetworks to generate hyper-prompts: we propose HyperPrompt, a novel architecture for prompt-based…

Computation and Language · Computer Science 2022-06-16 Yun He, Huaixiu Steven Zheng, Yi Tay, Jai Gupta, Yu Du, Vamsi Aribandi, Zhe Zhao, YaGuang Li, Zhao Chen, Donald Metzler, Heng-Tze Cheng, Ed H. Chi