Related papers: Parameter-efficient Model Adaptation for Vision Tr…

Cross-Modal Adapter: Parameter-Efficient Transfer Learning Approach for Vision-Language Models

Adapter-based parameter-efficient transfer learning has achieved exciting results in vision-language models. Traditional adapter methods often require training or fine-tuning, facing challenges such as insufficient samples or resource…

Computer Vision and Pattern Recognition · Computer Science 2024-04-22 Juncheng Yang, Zuchao Li, Shuai Xie, Weiping Zhu, Wei Yu, Shijun Li

VMT-Adapter: Parameter-Efficient Transfer Learning for Multi-Task Dense Scene Understanding

Large-scale pre-trained models have achieved remarkable success in various computer vision tasks. A standard approach to leverage these models is to fine-tune all model parameters for downstream tasks, which poses challenges in terms of…

Computer Vision and Pattern Recognition · Computer Science 2023-12-18 Yi Xin, Junlong Du, Qiang Wang, Zhiwen Lin, Ke Yan

Parameter Efficient Multimodal Transformers for Video Representation Learning

The recent success of Transformers in the language domain has motivated adapting it to a multimodal setting, where a new visual model is trained in tandem with an already pretrained language model. However, due to the excessive memory…

Computer Vision and Pattern Recognition · Computer Science 2021-09-23 Sangho Lee, Youngjae Yu, Gunhee Kim, Thomas Breuel, Jan Kautz, Yale Song

Adapting Vision Transformer for Efficient Change Detection

Most change detection models based on vision transformers currently follow a "pretraining then fine-tuning" strategy. This involves initializing the model weights using large scale classification datasets, which can be either natural images…

Computer Vision and Pattern Recognition · Computer Science 2023-12-11 Yang Zhao, Yuxiang Zhang, Yanni Dong, Bo Du

Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing

The advent of high-capacity pre-trained models has revolutionized problem-solving in computer vision, shifting the focus from training task-specific models to adapting pre-trained models. Consequently, effectively adapting large pre-trained…

Computer Vision and Pattern Recognition · Computer Science 2024-01-18 Wei Dong, Dawei Yan, Zhijun Lin, Peng Wang

Cross-Modal Adapter for Vision-Language Retrieval

Vision-language retrieval is an important multi-modal learning topic, where the goal is to retrieve the most relevant visual candidate for a given text query. Recently, pre-trained models, e.g., CLIP, show great potential on retrieval…

Computer Vision and Pattern Recognition · Computer Science 2025-09-03 Haojun Jiang, Jianke Zhang, Rui Huang, Chunjiang Ge, Zanlin Ni, Shiji Song, Gao Huang

Prompt Tuning based Adapter for Vision-Language Model Adaption

Large pre-trained vision-language (VL) models have shown significant promise in adapting to various downstream tasks. However, fine-tuning the entire network is challenging due to the massive number of model parameters. To address this…

Computer Vision and Pattern Recognition · Computer Science 2023-03-28 Jingchen Sun, Jiayu Qin, Zihao Lin, Changyou Chen

Parameter-Efficient Active Learning for Foundational models

Foundational vision transformer models have shown impressive few shot performance on many vision tasks. This research presents a novel investigation into the application of parameter efficient fine-tuning methods within an active learning…

Computer Vision and Pattern Recognition · Computer Science 2024-06-17 Athmanarayanan Lakshmi Narayanan, Ranganath Krishnan, Amrutha Machireddy, Mahesh Subedar

ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning

Capitalizing on large pre-trained models for various downstream tasks of interest has recently emerged with promising performance. Due to the ever-growing model size, the standard full fine-tuning based task adaptation strategy becomes…

Computer Vision and Pattern Recognition · Computer Science 2022-10-14 Junting Pan, Ziyi Lin, Xiatian Zhu, Jing Shao, Hongsheng Li

Time-, Memory- and Parameter-Efficient Visual Adaptation

As foundation models become more popular, there is a growing need to efficiently finetune them for downstream tasks. Although numerous adaptation methods have been proposed, they are designed to be efficient only in terms of how many…

Computer Vision and Pattern Recognition · Computer Science 2024-02-06 Otniel-Bogdan Mercea, Alexey Gritsenko, Cordelia Schmid, Anurag Arnab

Parameter-Efficient Multi-Task Learning via Progressive Task-Specific Adaptation

Parameter-efficient fine-tuning methods have emerged as a promising solution for adapting pre-trained models to various downstream tasks. While these methods perform well in single-task learning, extending them to multi-task learning…

Computer Vision and Pattern Recognition · Computer Science 2026-04-28 Neeraj Gangwar, Anshuka Rangi, Rishabh Deshmukh, Holakou Rahmanian, Yesh Dattatreya, Nickvash Kani

Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets

While parameter efficient tuning (PET) methods have shown great potential with transformer architecture on Natural Language Processing (NLP) tasks, their effectiveness with large-scale ConvNets is still under-studied on Computer Vision (CV)…

Computer Vision and Pattern Recognition · Computer Science 2024-04-15 Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides

BLIP-Adapter: Parameter-Efficient Transfer Learning for Mobile Screenshot Captioning

This study aims to explore efficient tuning methods for the screenshot captioning task. Recently, image captioning has seen significant advancements, but research in captioning tasks for mobile screens remains relatively scarce. Current…

Machine Learning · Computer Science 2023-09-27 Ching-Yu Chiang, I-Hua Chang, Shih-Wei Liao

Kernel Modulation: A Parameter-Efficient Method for Training Convolutional Neural Networks

Deep Neural Networks, particularly Convolutional Neural Networks (ConvNets), have achieved incredible success in many vision tasks, but they usually require millions of parameters for good accuracy performance. With increasing applications…

Computer Vision and Pattern Recognition · Computer Science 2022-03-30 Yuhuang Hu, Shih-Chii Liu

AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition

Pretraining Vision Transformers (ViTs) has achieved great success in visual recognition. A following scenario is to adapt a ViT to various image and video recognition tasks. The adaptation is challenging because of heavy computation and…

Computer Vision and Pattern Recognition · Computer Science 2022-10-18 Shoufa Chen, Chongjian Ge, Zhan Tong, Jiangliu Wang, Yibing Song, Jue Wang, Ping Luo

A Comprehensive Study of Vision Transformers in Image Classification Tasks

Image Classification is a fundamental task in the field of computer vision that frequently serves as a benchmark for gauging advancements in Computer Vision. Over the past few years, significant progress has been made in image…

Computer Vision and Pattern Recognition · Computer Science 2023-12-06 Mahmoud Khalil, Ahmad Khalil, Alioune Ngom

E^2VPT: An Effective and Efficient Approach for Visual Prompt Tuning

As the size of transformer-based models continues to grow, fine-tuning these large-scale pretrained vision models for new tasks has become increasingly parameter-intensive. Parameter-efficient learning has been developed to reduce the…

Computer Vision and Pattern Recognition · Computer Science 2023-07-27 Cheng Han, Qifan Wang, Yiming Cui, Zhiwen Cao, Wenguan Wang, Siyuan Qi, Dongfang Liu

Towards Efficient Visual Adaption via Structural Re-parameterization

Parameter-efficient transfer learning (PETL) is an emerging research spot aimed at inexpensively adapting large-scale pre-trained models to downstream tasks. Recent advances have achieved great success in saving storage costs for various…

Computer Vision and Pattern Recognition · Computer Science 2023-03-22 Gen Luo, Minglang Huang, Yiyi Zhou, Xiaoshuai Sun, Guannan Jiang, Zhiyu Wang, Rongrong Ji

Improvise, Adapt, Overcome -- Telescopic Adapters for Efficient Fine-tuning of Vision Language Models in Medical Imaging

Adapting Vision Language Segmentation Models (VLSMs) to medical imaging domains requires significant computational overhead when using conventional fine-tuning approaches. Existing Parameter-Efficient Fine-Tuning (PEFT) methods apply…

Computer Vision and Pattern Recognition · Computer Science 2026-04-03 Ujjwal Mishra, Vinita Shukla, Praful Hambarde, Amit Shukla

Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models

Adapting large-scale pre-trained generative models in a parameter-efficient manner is gaining traction. Traditional methods like low rank adaptation achieve parameter efficiency by imposing constraints but may not be optimal for tasks…

Computer Vision and Pattern Recognition · Computer Science 2024-06-03 Xinxi Zhang, Song Wen, Ligong Han, Felix Juefei-Xu, Akash Srivastava, Junzhou Huang, Hao Wang, Molei Tao, Dimitris N. Metaxas