Related papers: Backbone Augmented Training for Adaptations

200 papers

Adversarial training (AT) with projected gradient descent is the most popular method to improve model robustness under adversarial attacks. However, computational overheads become prohibitively large when AT is applied to large backbone…

Machine Learning · Computer Science 2025-08-26 Quanwei Wu, Jun Guo, Wei Wang, Yi Wang

Personalizing diffusion models using limited data presents significant challenges, including overfitting, loss of prior knowledge, and degradation of text alignment. Overfitting leads to shifts in the noise prediction distribution,…

Computer Vision and Pattern Recognition · Computer Science 2025-07-04 JungWoo Chae, Jiyoon Kim, JaeWoong Choi, Kyungyul Kim, Sangheum Hwang

Newly-introduced deep learning architectures, namely BERT, XLNet, RoBERTa and ALBERT, have been proved to be robust on several NLP tasks. However, the datasets trained on these architectures are fixed in terms of size and generalizability.…

Computation and Language · Computer Science 2020-09-29 Jean-Philippe Corbeil, Hadi Abdi Ghadivel

Modern retrieval systems often require recomputing the representation of every piece of data in the gallery when updating to a better representation model. This process is known as backfilling and can be especially costly in the real world…

Computer Vision and Pattern Recognition · Computer Science 2023-08-29 Yifei Zhou, Zilu Li, Abhinav Shrivastava, Hengshuang Zhao, Antonio Torralba, Taipeng Tian, Ser-Nam Lim

Adversarial training (AT) aims to improve the robustness of deep learning models by mixing clean data and adversarial examples (AEs). Most existing AT approaches can be grouped into restricted and unrestricted approaches. Restricted AT…

Machine Learning · Computer Science 2020-04-14 Haidong Xie, Xueshuang Xiang, Naijin Liu, Bin Dong

Language models (LMs) pretrained on a large text corpus and fine-tuned on a downstream task have become a de facto training strategy for several natural language processing (NLP) tasks. Recently, an…

Computation and Language · Computer Science 2021-07-23 Junghoon Lee, Jounghee Kim, Pilsung Kang

It is today acknowledged that neural network language models outperform backoff language models in applications like speech recognition or statistical machine translation. However, training these models on large amounts of data can take…

Neural and Evolutionary Computing · Computer Science 2015-07-08 Aram Ter-Sarkisov, Holger Schwenk, Loic Barrault, Fethi Bougares

Recently, Transformers have been introduced into the field of acoustics recognition. They are pre-trained on large-scale datasets using methods such as supervised learning and semi-supervised learning, demonstrating robust generality--It…

Sound · Computer Science 2024-01-22 Yun Liang, Hai Lin, Shaojian Qiu, Yihang Zhang

Alignment training is crucial for enabling large language models (LLMs) to cater to human intentions and preferences. It is typically performed based on two stages with different objectives: instruction-following alignment and…

Computation and Language · Computer Science 2024-06-24 Chenglong Wang, Hang Zhou, Kaiyan Chang, Bei Li, Yongyu Mu, Tong Xiao, Tongran Liu, Jingbo Zhu

This paper presents a novel approach to enhance the Binary-Addition-Tree algorithm (BAT) by integrating incremental learning techniques. BAT, known for its simplicity in development, implementation, and application, is a powerful implicit…

Machine Learning · Computer Science 2024-09-25 Wei-Chang Yeh

We introduce ADEPT: Adaptive Data ExPloiTation, a simple yet powerful framework to enhance the data efficiency and generalization in deep reinforcement learning (RL). Specifically, ADEPT adaptively manages the use of sampled data…

Machine Learning · Computer Science 2025-01-23 Mingqi Yuan, Bo Li, Xin Jin, Wenjun Zeng

Recent work on time-series models has leveraged self-supervised training to learn meaningful features and patterns in order to improve performance on downstream tasks and generalize to unseen modalities. While these pretraining methods have…

Machine Learning · Computer Science 2026-04-10 Paul Quinlan, Qingguo Li, Xiaodan Zhu

Test-time adaptation (TTA) allows a model to be adapted to an unseen domain without accessing the source data. Due to the nature of practical environments, TTA has a limited amount of data for adaptation. Recent TTA methods further restrict…

Computer Vision and Pattern Recognition · Computer Science 2024-10-21 Younggeol Cho, Youngrae Kim, Junho Yoon, Seunghoon Hong, Dongman Lee

Vision-language foundation models have been incredibly successful in a wide range of downstream computer vision tasks using adaptation methods. However, due to the high cost of obtaining pre-training datasets, pairs with weak image-text…

Computer Vision and Pattern Recognition · Computer Science 2024-09-27 Wenshuo Peng, Kaipeng Zhang, Yue Yang, Hao Zhang, Yu Qiao

Fine-tuning and inference with large Language Models (LM) are generally known to be expensive. Parameter-efficient fine-tuning over pretrained LMs reduces training memory by updating a small number of LM parameters but does not improve…

Computation and Language · Computer Science 2024-06-05 Bowen Zhao, Hannaneh Hajishirzi, Qingqing Cao

Multimodal learning pipelines have benefited from the success of pretrained language models. However, this comes at the cost of increased model parameters. In this work, we propose Adapted Multimodal BERT (AMB), a BERT-based architecture…

Computation and Language · Computer Science 2022-12-02 Odysseas S. Chlapanis, Georgios Paraskevopoulos, Alexandros Potamianos

We introduce RE-Adapt, an approach to fine-tuning large language models on new domains without degrading any pre-existing instruction-tuning. We reverse engineer an adapter which isolates what an instruction-tuned model has learned beyond…

Computation and Language · Computer Science 2024-05-27 William Fleshman, Benjamin Van Durme

Deep learning has performed remarkably well on many tasks recently. However, the superior performance of deep models relies heavily on the availability of a large number of training data, which limits the wide adaptation of deep models on…

Machine Learning · Computer Science 2022-10-14 Huiyuan Yang, Han Yu, Akane Sano

Machine learning has achieved great success in electroencephalogram (EEG) based brain-computer interfaces (BCIs). Most existing BCI studies focused on improving the decoding accuracy, with only a few considering the adversarial security.…

Human-Computer Interaction · Computer Science 2024-11-05 Xiaoqing Chen, Ziwei Wang, Dongrui Wu

Data augmentation is practically helpful for visual recognition, especially at the time of data scarcity. However, such success is only limited to quite a few light augmentations (e.g., random crop, flip). Heavy augmentations are either…

Computer Vision and Pattern Recognition · Computer Science 2023-03-17 Yalong Bai, Mohan Zhou, Wei Zhang, Bowen Zhou, Tao Mei