Related papers: TAIL: Text-Audio Incremental Learning

Incremental Prototype Tuning for Class Incremental Learning

Class incremental learning (CIL) has attracted much attention, but most existing related works focus on fine-tuning the entire representation model, which inevitably results in much catastrophic forgetting. In contrast, with a…

Computer Vision and Pattern Recognition · Computer Science 2023-02-10 Jieren Deng, Jianhua Hu, Haojian Zhang, Yunkuan Wang

Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss

In text-audio retrieval (TAR) tasks, due to the heterogeneity of contents between text and audio, the semantic information contained in the text is only similar to certain frames within the audio. Yet, existing works aggregate the entire…

Sound · Computer Science 2023-03-31 Yifei Xin, Dongchao Yang, Yuexian Zou

AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks

Recently, Transformers have been introduced into the field of acoustics recognition. They are pre-trained on large-scale datasets using methods such as supervised learning and semi-supervised learning, demonstrating robust generality--It…

Sound · Computer Science 2024-01-22 Yun Liang, Hai Lin, Shaojian Qiu, Yihang Zhang

Continual Adapter Tuning with Semantic Shift Compensation for Class-Incremental Learning

Class-incremental learning (CIL) aims to enable models to continuously learn new classes while overcoming catastrophic forgetting. The introduction of pre-trained models has brought new tuning paradigms to CIL. In this paper, we revisit…

Computer Vision and Pattern Recognition · Computer Science 2025-10-13 Qinhao Zhou, Yuwen Tan, Boqing Gong, Xiang Xiang

TEAL: New Selection Strategy for Small Buffers in Experience Replay Class Incremental Learning

Continual Learning is an unresolved challenge, whose relevance increases when considering modern applications. Unlike the human brain, trained deep neural networks suffer from a phenomenon called catastrophic forgetting, wherein they…

Machine Learning · Computer Science 2025-02-18 Shahar Shaul-Ariel, Daphna Weinshall

Achieving More with Less: Additive Prompt Tuning for Rehearsal-Free Class-Incremental Learning

Class-incremental learning (CIL) enables models to learn new classes progressively while preserving knowledge of previously learned ones. Recent advances in this field have shifted towards parameter-efficient fine-tuning techniques, with…

Computer Vision and Pattern Recognition · Computer Science 2025-08-13 Haoran Chen, Ping Wang, Zihan Zhou, Xu Zhang, Zuxuan Wu, Yu-Gang Jiang

When Parameter-efficient Tuning Meets General-purpose Vision-language Models

Instruction tuning has shown promising potential for developing general-purpose AI capabilities by using large-scale pre-trained models and boosts growing research to integrate multimodal information for creative applications. However,…

Computation and Language · Computer Science 2023-12-21 Yihang Zhai, Haixin Wang, Jianlong Chang, Xinlong Yang, Jinan Sun, Shikun Zhang, Qi Tian

TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models

The full potential of large pretrained models remains largely untapped in control domains like robotics. This is mainly because of the scarcity of data and the computational challenges associated with training or fine-tuning these large…

Machine Learning · Computer Science 2024-03-11 Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor

Incremental Learning of Acoustic Scenes and Sound Events

In this paper, we propose a method for incremental learning of two distinct tasks over time: acoustic scene classification (ASC) and audio tagging (AT). We use a simple convolutional neural network (CNN) model as an incremental learner to…

Audio and Speech Processing · Electrical Eng. & Systems 2023-08-25 Manjunath Mulimani, Annamaria Mesaros

CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations

Existing audio-language task-specific predictive approaches focus on building complicated late-fusion mechanisms. However, these models are facing challenges of overfitting with limited labels and low model generalization abilities. In this…

Sound · Computer Science 2021-09-02 Hang Li, Yu Kang, Tianqiao Liu, Wenbiao Ding, Zitao Liu

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

Parameter-efficient transfer learning (PETL) methods have emerged as a solid alternative to the standard full fine-tuning approach. They only train a few extra parameters for each downstream task, without sacrificing performance and…

Audio and Speech Processing · Electrical Eng. & Systems 2024-07-16 Umberto Cappellazzo, Daniele Falavigna, Alessio Brutti, Mirco Ravanelli

Adapter Incremental Continual Learning of Efficient Audio Spectrogram Transformers

Continual learning involves training neural networks incrementally for new tasks while retaining the knowledge of previous tasks. However, efficiently fine-tuning the model for sequential tasks with minimal computational resources remains a…

Sound · Computer Science 2024-01-03 Nithish Muthuchamy Selvaraj, Xiaobao Guo, Adams Kong, Bingquan Shen, Alex Kot

Instance-wise Prompt Tuning for Pretrained Language Models

Prompt Learning has recently gained great popularity in bridging the gap between pretraining tasks and various downstream tasks. It freezes Pretrained Language Models (PLMs) and only tunes a few task-related parameters (prompts) for…

Computation and Language · Computer Science 2022-06-07 Yuezihan Jiang, Hao Yang, Junyang Lin, Hanyu Zhao, An Yang, Chang Zhou, Hongxia Yang, Zhi Yang, Bin Cui

IAP: Improving Continual Learning of Vision-Language Models via Instance-Aware Prompting

Recent pre-trained vision-language models (PT-VLMs) often face a Multi-Domain Task Incremental Learning (MTIL) scenario in practice, where several classes and domains of multi-modal tasks arrive incrementally. Without access to…

Computer Vision and Pattern Recognition · Computer Science 2025-07-08 Hao Fu, Hanbin Zhao, Jiahua Dong, Henghui Ding, Chao Zhang, Hui Qian

Continual Learning on CLIP via Incremental Prompt Tuning with Intrinsic Textual Anchors

Continual learning (CL) enables deep networks to acquire new knowledge while avoiding catastrophic forgetting. The powerful generalization ability of pre-trained models (PTMs), such as the Contrastive Language-Image Pre-training (CLIP)…

Computer Vision and Pattern Recognition · Computer Science 2025-12-22 Haodong Lu, Xinyu Zhang, Kristen Moore, Jason Xue, Lina Yao, Anton van den Hengel, Dong Gong

Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning

Class Incremental Learning (CIL) aims to continuously learn new categories while retaining the knowledge of old ones. Pre-trained models (PTMs) show promising capabilities in CIL. However, existing approaches that apply lightweight…

Computer Vision and Pattern Recognition · Computer Science 2025-09-25 Kai Jiang, Zhengyan Shi, Dell Zhang, Hongyuan Zhang, Xuelong Li

Active Learning Based Fine-Tuning Framework for Speech Emotion Recognition

Speech emotion recognition (SER) has drawn increasing attention for its applications in human-machine interaction. However, existing SER methods ignore the information gap between the pre-training speech recognition task and the downstream…

Sound · Computer Science 2023-10-03 Dongyuan Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura

Retrieval-Augmented Text-to-Audio Generation

Despite recent progress in text-to-audio (TTA) generation, we show that the state-of-the-art models, such as AudioLDM, trained on datasets with an imbalanced class distribution, such as AudioCaps, are biased in their generation performance.…

Sound · Computer Science 2024-01-08 Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

Integrated Parameter-Efficient Tuning for General-Purpose Audio Models

The advent of hyper-scale and general-purpose pre-trained models is shifting the paradigm of building task-specific models for target tasks. In the field of audio research, task-agnostic pre-trained models with high transferability and…

Audio and Speech Processing · Electrical Eng. & Systems 2023-03-03 Ju-ho Kim, Jungwoo Heo, Hyun-seo Shin, Chan-yeong Lim, Ha-Jin Yu

Learning a Better Initialization for Soft Prompts via Meta-Learning

Prompt tuning (PT) is an effective approach to adapting pre-trained language models to downstream tasks. Without a good initialization, prompt tuning does not perform well under few-shot settings. So pre-trained prompt tuning (PPT) is…

Computation and Language · Computer Science 2022-05-26 Yukun Huang, Kun Qian, Zhou Yu