Related papers: HyperLoader: Integrating Hypernetwork-Based LoRA a…

200 papers

State-of-the-art parameter-efficient fine-tuning methods rely on introducing adapter modules between the layers of a pretrained language model. However, such modules are trained separately for each task and thus do not enable sharing…

Computation and Language · Computer Science 2021-06-09 Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson

We investigate input-conditioned hypernetworks for multi-tasking in NLP, generating parameter-efficient adaptations for a decoder using a hypernetwork conditioned on the output of an encoder. This approach produces a unique decoder…

Computation and Language · Computer Science 2022-10-19 Hamish Ivison, Matthew E. Peters

Modern Transformer-based models frequently suffer from miscalibration, producing overconfident predictions that do not reflect true empirical frequencies. This work investigates the calibration dynamics of LoRA: Low-Rank Adaptation and a…

Computation and Language · Computer Science 2026-03-31 Bartosz Trojan, Filip Gębala

Fine-tuning large language models for different tasks can be costly and inefficient, and even methods that reduce the number of tuned parameters still require full gradient-based optimization. We propose HyperTuning, a novel approach to…

Computation and Language · Computer Science 2022-11-23 Jason Phang, Yi Mao, Pengcheng He, Weizhu Chen

In this work we propose a HyperTransformer, a Transformer-based model for supervised and semi-supervised few-shot learning that generates weights of a convolutional neural network (CNN) directly from support samples. Since the dependence of…

Machine Learning · Computer Science 2022-07-15 Andrey Zhmoginov, Mark Sandler, Max Vladymyrov

Foundation models excel across diverse tasks, but adapting them to specialized applications often requires fine-tuning, an approach that is memory and compute-intensive. Parameter-efficient fine-tuning (PEFT) methods mitigate this by…

Machine Learning · Computer Science 2026-04-24 Abel Gurung, Joseph Campbell

There has been a significant increase in the deployment of neural network models, presenting substantial challenges in model adaptation and fine-tuning. Efficient adaptation is crucial in maintaining model performance across diverse tasks…

Machine Learning · Computer Science 2025-04-02 Maolin Wang, Xiangyu Zhao

The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of pretrained models growing rapidly, how to perform parameter-efficient…

Computation and Language · Computer Science 2022-03-09 Zhengkun Zhang, Wenya Guo, Xiaojun Meng, Yasheng Wang, Yadao Wang, Xin Jiang, Qun Liu, Zhenglu Yang

Deploying natural language processing (NLP) models on mobile platforms requires models that can adapt across diverse applications while remaining efficient in memory and computation. We investigate pre-finetuning strategies to enhance the…

Computation and Language · Computer Science 2025-10-10 Junyi Zhu, Savas Ozkan, Andrea Maracani, Sinan Mutlu, Cho Jung Min, Mete Ozay

Real-world applications of object recognition often require the solution of multiple tasks in a single platform. Under the standard paradigm of network fine-tuning, an entirely new CNN is learned per task, and the final network size is…

Computer Vision and Pattern Recognition · Computer Science 2019-07-02 Pedro Morgado, Nuno Vasconcelos

Parameter-efficient fine-tuning (PEFT) has emerged as a powerful paradigm for adapting large-scale pre-trained models to downstream tasks with minimal additional parameters. Among PEFT methods, Low-Rank Adaptation (LoRA) stands out for its…

Machine Learning · Computer Science 2026-02-03 Nghiem T. Diep, Dung Le, Tuan Truong, Tan Dinh, Huy Nguyen, Nhat Ho

Achieving state-of-the-art performance on natural language understanding tasks typically relies on fine-tuning a fresh model for every task. Consequently, this approach leads to a higher overall parameter cost, along with higher technical…

Computation and Language · Computer Science 2020-07-14 Yi Tay, Zhe Zhao, Dara Bahri, Donald Metzler, Da-Cheng Juan

Multi-Task Learning (MTL) networks have emerged as a promising method for transferring learned knowledge across different tasks. However, MTL must deal with challenges such as: overfitting to low resource tasks, catastrophic forgetting, and…

Machine Learning · Computer Science 2022-04-22 Jonathan Pilault, Amine Elhattami, Christopher Pal

Adapting large-scale pretrained language models to downstream tasks via fine-tuning is the standard method for achieving state-of-the-art performance on NLP benchmarks. However, fine-tuning all weights of models with millions or billions of…

Computation and Language · Computer Science 2021-11-30 Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder

Low-Rank Adaptation (LoRA) has emerged as one of the most widely used parameter-efficient fine-tuning (PEFT) methods for adapting large language models (LLMs) to downstream tasks. While highly effective in single-task settings, it struggles…

Computation and Language · Computer Science 2025-10-14 Bo Cheng, Xu Wang, Jinda Liu, Yi Chang, Yuan Wu

The widespread utilization of language models in modern applications is inconceivable without Parameter-Efficient Fine-Tuning techniques, such as low-rank adaptation ($\texttt{LoRA}$), which adds trainable adapters to selected layers.…

Machine Learning · Computer Science 2025-10-17 Andrey Veprikov, Vladimir Solodkin, Alexander Zyl, Andrey Savchenko, Aleksandr Beznosikov

In this paper, we introduce Symmetric Low-Rank Adapters, an optimized variant of LoRA with even fewer weights. This method utilizes Low-Rank Symmetric Weight Matrices to learn downstream tasks more efficiently. Traditional LoRA accumulates…

Machine Learning · Computer Science 2025-04-17 Tales Panoutsos, Rodrygo L. T. Santos, Flavio Figueiredo

Hypernetworks are models that generate or modulate the weights of another network. They provide a flexible mechanism for injecting context and task conditioning and have proven broadly useful across diverse applications without significant…

Computer Vision and Pattern Recognition · Computer Science 2026-01-21 Eli Passov, Nathan S. Netanyahu, Yosi Keller

Recent work on mode connectivity in the loss landscape of deep neural networks has demonstrated that the locus of (sub-)optimal weight vectors lies on continuous paths. In this work, we train a neural network that serves as a hypernetwork,…

Machine Learning · Statistics 2019-05-09 Lior Deutsch, Erik Nijkamp, Yu Yang

Prompt-Tuning is a new paradigm for finetuning pre-trained language models in a parameter-efficient way. Here, we explore the use of HyperNetworks to generate hyper-prompts: we propose HyperPrompt, a novel architecture for prompt-based…

Computation and Language · Computer Science 2022-06-16 Yun He, Huaixiu Steven Zheng, Yi Tay, Jai Gupta, Yu Du, Vamsi Aribandi, Zhe Zhao, YaGuang Li, Zhao Chen, Donald Metzler, Heng-Tze Cheng, Ed H. Chi