Related papers: Time-, Memory- and Parameter-Efficient Visual Adap…

Parameter-efficient is not sufficient: Exploring Parameter, Memory, and Time Efficient Adapter Tuning for Dense Predictions

Pre-training & fine-tuning is a prevalent paradigm in computer vision (CV). Recently, parameter-efficient transfer learning (PETL) methods have shown promising performance in adapting to downstream tasks with only a few trainable…

Computer Vision and Pattern Recognition · Computer Science 2023-11-29 Dongshuo Yin, Xueting Han, Bin Li, Hao Feng, Jing Bai

PVP: Pre-trained Visual Parameter-Efficient Tuning

Large-scale pre-trained transformers have demonstrated remarkable success in various computer vision tasks. However, it is still highly challenging to fully fine-tune these models for downstream tasks due to their high computational and…

Computer Vision and Pattern Recognition · Computer Science 2023-04-27 Zhao Song, Ke Yang, Naiyang Guan, Junjie Zhu, Peng Qiao, Qingyong Hu

Parameter Efficient Multimodal Transformers for Video Representation Learning

The recent success of Transformers in the language domain has motivated adapting it to a multimodal setting, where a new visual model is trained in tandem with an already pretrained language model. However, due to the excessive memory…

Computer Vision and Pattern Recognition · Computer Science 2021-09-23 Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

Visual Prompt Tuning

The current modus operandi in adapting pre-trained models involves updating all the backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) as an efficient and effective alternative to full fine-tuning…

Computer Vision and Pattern Recognition · Computer Science 2022-07-21 Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge Belongie, Bharath Hariharan, Ser-Nam Lim

AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition

Pretraining Vision Transformers (ViTs) has achieved great success in visual recognition. A following scenario is to adapt a ViT to various image and video recognition tasks. The adaptation is challenging because of heavy computation and…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Shoufa Chen, Chongjian Ge, Zhan Tong, Jiangliu Wang, Yibing Song, Jue Wang, Ping Luo

UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling

Large-scale vision-language pre-trained models have shown promising transferability to various downstream tasks. As the size of these foundation models and the number of downstream tasks grow, the standard full fine-tuning paradigm becomes…

Computer Vision and Pattern Recognition · Computer Science 2023-05-23 Haoyu Lu, Yuqi Huo, Guoxing Yang, Zhiwu Lu, Wei Zhan, Masayoshi Tomizuka, Mingyu Ding

Parameter-Efficient Multi-Task Learning via Progressive Task-Specific Adaptation

Parameter-efficient fine-tuning methods have emerged as a promising solution for adapting pre-trained models to various downstream tasks. While these methods perform well in single-task learning, extending them to multi-task learning…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Neeraj Gangwar, Anshuka Rangi, Rishabh Deshmukh, Holakou Rahmanian, Yesh Dattatreya, Nickvash Kani

ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning

Capitalizing on large pre-trained models for various downstream tasks of interest has recently emerged with promising performance. Due to the ever-growing model size, the standard full fine-tuning based task adaptation strategy becomes…

Computer Vision and Pattern Recognition · Computer Science 2022-10-14 Junting Pan, Ziyi Lin, Xiatian Zhu, Jing Shao, Hongsheng Li

Parameter-efficient Model Adaptation for Vision Transformers

In computer vision, it has achieved great transfer learning performance via adapting large-scale pretrained vision models (e.g., vision transformers) to downstream tasks. Common approaches for model adaptation either update all model…

Computer Vision and Pattern Recognition · Computer Science 2023-07-18 Xuehai He, Chunyuan Li, Pengchuan Zhang, Jianwei Yang, Xin Eric Wang

ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video

Adapting image models to the video domain has emerged as an efficient paradigm for solving video recognition tasks. Due to the huge number of parameters and effective transferability of image models, performing full fine-tuning is less…

Computer Vision and Pattern Recognition · Computer Science 2024-07-12 Xinhao Li, Yuhan Zhu, Limin Wang

AdaViT: Adaptive Vision Transformers for Efficient Image Recognition

Built on top of self-attention mechanisms, vision transformers have demonstrated remarkable performance on a variety of vision tasks recently. While achieving excellent performance, they still require relatively intensive computational cost…

Computer Vision and Pattern Recognition · Computer Science 2021-12-01 Lingchen Meng, Hengduo Li, Bor-Chun Chen, Shiyi Lan, Zuxuan Wu, Yu-Gang Jiang, Ser-Nam Lim

E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning

As the size of transformer-based models continues to grow, fine-tuning these large-scale pretrained vision models for new tasks has become increasingly parameter-intensive. Parameter-efficient learning has been developed to reduce the…

Computer Vision and Pattern Recognition · Computer Science 2023-07-27 Cheng Han, Qifan Wang, Yiming Cui, Zhiwen Cao, Wenguan Wang, Siyuan Qi, Dongfang Liu

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames

Recently, temporal action detection (TAD) has seen significant performance improvement with end-to-end training. However, due to the memory bottleneck, only models with limited scales and limited data volumes can afford end-to-end training,…

Computer Vision and Pattern Recognition · Computer Science 2024-04-23 Shuming Liu, Chen-Lin Zhang, Chen Zhao, Bernard Ghanem

Towards Efficient Visual Adaption via Structural Re-parameterization

Parameter-efficient transfer learning (PETL) is an emerging research spot aimed at inexpensively adapting large-scale pre-trained models to downstream tasks. Recent advances have achieved great success in saving storage costs for various…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Gen Luo, Minglang Huang, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Zhiyu Wang, Rongrong Ji

Memory-based Parameter Adaptation

Deep neural networks have excelled on a wide range of problems, from vision to language and game playing. Neural networks very gradually incorporate information into weights as they process data, requiring very low learning rates. If the…

Machine Learning · Statistics 2018-03-01 Pablo Sprechmann, Siddhant M. Jayakumar, Jack W. Rae, Alexander Pritzel, Adrià Puigdomènech Badia, Benigno Uria, Oriol Vinyals, Demis Hassabis, Razvan Pascanu, Charles Blundell

Vision Backbone Efficient Selection for Image Classification in Low-Data Regimes

Transfer learning has become an essential tool in modern computer vision, allowing practitioners to leverage backbones, pretrained on large datasets, to train successful models from limited annotated data. Choosing the right backbone is…

Computer Vision and Pattern Recognition · Computer Science 2025-08-20 Joris Guerin, Shray Bansal, Amirreza Shaban, Paulo Mann, Harshvardhan Gazula

IV-tuning: Parameter-Efficient Transfer Learning for Infrared-Visible Tasks

Existing infrared and visible (IR-VIS) methods inherit the general representations of Pre-trained Visual Models (PVMs) to facilitate complementary learning. However, our analysis indicates that under the full fine-tuning paradigm, the…

Computer Vision and Pattern Recognition · Computer Science 2026-02-27 Yaming Zhang, Chenqiang Gao, Fangcen Liu, Junjie Guo, Lan Wang, Xinggan Peng, Deyu Meng

PELA: Learning Parameter-Efficient Models with Low-Rank Approximation

Applying a pre-trained large model to downstream tasks is prohibitive under resource-constrained conditions. Recent dominant approaches for addressing efficiency issues involve adding a few learnable parameters to the fixed backbone model.…

Computer Vision and Pattern Recognition · Computer Science 2023-11-21 Yangyang Guo, Guangzhi Wang, Mohan Kankanhalli

EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones

The superior performance of modern deep networks usually comes with a costly training procedure. This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers). Our work is…

Computer Vision and Pattern Recognition · Computer Science 2023-08-17 Yulin Wang, Yang Yue, Rui Lu, Tianjiao Liu, Zhao Zhong, Shiji Song, Gao Huang

Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets

While parameter efficient tuning (PET) methods have shown great potential with transformer architecture on Natural Language Processing (NLP) tasks, their effectiveness with large-scale ConvNets is still under-studied on Computer Vision (CV)…

Computer Vision and Pattern Recognition · Computer Science 2024-04-15 Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides