Related papers: TAIL: Text-Audio Incremental Learning

200 papers

Class incremental learning (CIL) has attracted much attention, but most existing related works focus on fine-tuning the entire representation model, which inevitably results in much catastrophic forgetting. In contrast, with a…

Computer Vision and Pattern Recognition · Computer Science · 2023-02-10 · Jieren Deng, Jianhua Hu, Haojian Zhang, Yunkuan Wang

In text-audio retrieval (TAR) tasks, due to the heterogeneity of contents between text and audio, the semantic information contained in the text is only similar to certain frames within the audio. Yet, existing works aggregate the entire…

Sound · Computer Science · 2023-03-31 · Yifei Xin, Dongchao Yang, Yuexian Zou

Recently, Transformers have been introduced into the field of acoustics recognition. They are pre-trained on large-scale datasets using methods such as supervised learning and semi-supervised learning, demonstrating robust generality. It…

Sound · Computer Science · 2024-01-22 · Yun Liang, Hai Lin, Shaojian Qiu, Yihang Zhang

Class-incremental learning (CIL) aims to enable models to continuously learn new classes while overcoming catastrophic forgetting. The introduction of pre-trained models has brought new tuning paradigms to CIL. In this paper, we revisit…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-13 · Qinhao Zhou, Yuwen Tan, Boqing Gong, Xiang Xiang

Continual Learning is an unresolved challenge, whose relevance increases when considering modern applications. Unlike the human brain, trained deep neural networks suffer from a phenomenon called catastrophic forgetting, wherein they…

Machine Learning · Computer Science · 2025-02-18 · Shahar Shaul-Ariel, Daphna Weinshall

Class-incremental learning (CIL) enables models to learn new classes progressively while preserving knowledge of previously learned ones. Recent advances in this field have shifted towards parameter-efficient fine-tuning techniques, with…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-13 · Haoran Chen, Ping Wang, Zihan Zhou, Xu Zhang, Zuxuan Wu, Yu-Gang Jiang

Instruction tuning has shown promising potential for developing general-purpose AI capabilities by using large-scale pre-trained models and boosts growing research to integrate multimodal information for creative applications. However,…

Computation and Language · Computer Science · 2023-12-21 · Yihang Zhai, Haixin Wang, Jianlong Chang, Xinlong Yang, Jinan Sun, Shikun Zhang, Qi Tian

The full potential of large pretrained models remains largely untapped in control domains like robotics. This is mainly because of the scarcity of data and the computational challenges associated with training or fine-tuning these large…

Machine Learning · Computer Science · 2024-03-11 · Zuxin Liu, Jesse Zhang, Kavosh Asadi, Yao Liu, Ding Zhao, Shoham Sabach, Rasool Fakoor

In this paper, we propose a method for incremental learning of two distinct tasks over time: acoustic scene classification (ASC) and audio tagging (AT). We use a simple convolutional neural network (CNN) model as an incremental learner to…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-08-25 · Manjunath Mulimani, Annamaria Mesaros

Existing audio-language task-specific predictive approaches focus on building complicated late-fusion mechanisms. However, these models are facing challenges of overfitting with limited labels and low model generalization abilities. In this…

Sound · Computer Science · 2021-09-02 · Hang Li, Yu Kang, Tianqiao Liu, Wenbiao Ding, Zitao Liu

Parameter-efficient transfer learning (PETL) methods have emerged as a solid alternative to the standard full fine-tuning approach. They only train a few extra parameters for each downstream task, without sacrificing performance and…

Audio and Speech Processing · Electrical Eng. & Systems · 2024-07-16 · Umberto Cappellazzo, Daniele Falavigna, Alessio Brutti, Mirco Ravanelli

Continual learning involves training neural networks incrementally for new tasks while retaining the knowledge of previous tasks. However, efficiently fine-tuning the model for sequential tasks with minimal computational resources remains a…

Sound · Computer Science · 2024-01-03 · Nithish Muthuchamy Selvaraj, Xiaobao Guo, Adams Kong, Bingquan Shen, Alex Kot

Prompt Learning has recently gained great popularity in bridging the gap between pretraining tasks and various downstream tasks. It freezes Pretrained Language Models (PLMs) and only tunes a few task-related parameters (prompts) for…

Computation and Language · Computer Science · 2022-06-07 · Yuezihan Jiang, Hao Yang, Junyang Lin, Hanyu Zhao, An Yang, Chang Zhou, Hongxia Yang, Zhi Yang, Bin Cui

Recent pre-trained vision-language models (PT-VLMs) often face a Multi-Domain Task Incremental Learning (MTIL) scenario in practice, where several classes and domains of multi-modal tasks arrive incrementally. Without access to…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-08 · Hao Fu, Hanbin Zhao, Jiahua Dong, Henghui Ding, Chao Zhang, Hui Qian

Continual learning (CL) enables deep networks to acquire new knowledge while avoiding catastrophic forgetting. The powerful generalization ability of pre-trained models (PTMs), such as the Contrastive Language-Image Pre-training (CLIP)…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-22 · Haodong Lu, Xinyu Zhang, Kristen Moore, Jason Xue, Lina Yao, Anton van den Hengel, Dong Gong

Class Incremental Learning (CIL) aims to continuously learn new categories while retaining the knowledge of old ones. Pre-trained models (PTMs) show promising capabilities in CIL. However, existing approaches that apply lightweight…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-25 · Kai Jiang, Zhengyan Shi, Dell Zhang, Hongyuan Zhang, Xuelong Li

Speech emotion recognition (SER) has drawn increasing attention for its applications in human-machine interaction. However, existing SER methods ignore the information gap between the pre-training speech recognition task and the downstream…

Sound · Computer Science · 2023-10-03 · Dongyuan Li, Yusong Wang, Kotaro Funakoshi, Manabu Okumura

Despite recent progress in text-to-audio (TTA) generation, we show that the state-of-the-art models, such as AudioLDM, trained on datasets with an imbalanced class distribution, such as AudioCaps, are biased in their generation performance.…

Sound · Computer Science · 2024-01-08 · Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang

The advent of hyper-scale and general-purpose pre-trained models is shifting the paradigm of building task-specific models for target tasks. In the field of audio research, task-agnostic pre-trained models with high transferability and…

Audio and Speech Processing · Electrical Eng. & Systems · 2023-03-03 · Ju-ho Kim, Jungwoo Heo, Hyun-seo Shin, Chan-yeong Lim, Ha-Jin Yu

Prompt tuning (PT) is an effective approach to adapting pre-trained language models to downstream tasks. Without a good initialization, prompt tuning does not perform well under few-shot settings, so pre-trained prompt tuning (PPT) is…

Computation and Language · Computer Science · 2022-05-26 · Yukun Huang, Kun Qian, Zhou Yu