English
Related papers

Related papers: Parameter-Efficient Transfer Learning with Diff Pr…

200 papers

Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we…

Fine-tuning large pre-trained language models on downstream tasks has become the de-facto learning paradigm in NLP. However, conventional approaches fine-tune all the parameters of the pre-trained model, which becomes prohibitive as the…

Computation and Language · Computer Science 2022-02-03 Junxian He , Chunting Zhou , Xuezhe Ma , Taylor Berg-Kirkpatrick , Graham Neubig

Parameter fine tuning is a transfer learning approach whereby learned parameters from pre-trained source network are transferred to the target network followed by fine-tuning. Prior research has shown that this approach is capable of…

Computer Vision and Pattern Recognition · Computer Science 2019-09-20 Tasfia Shermin , Shyh Wei Teng , Manzur Murshed , Guojun Lu , Ferdous Sohel , Manoranjan Paul

Efficient finetuning of pretrained language transformers is becoming increasingly prevalent for solving natural language processing tasks. While effective, it can still require a large number of tunable parameters. This can be a drawback…

Computation and Language · Computer Science 2023-05-31 Umang Gupta , Aram Galstyan , Greg Ver Steeg

Transformer-based pre-trained language models have significantly improved the performance of various natural language processing (NLP) tasks in the recent years. While effective and prevalent, these models are usually prohibitively large…

Computation and Language · Computer Science 2022-01-19 Dongkuan Xu , Ian E. H. Yen , Jinxi Zhao , Zhibin Xiao

Most uses of machine learning today involve training a model from scratch for a particular task, or sometimes starting with a model pretrained on a related task and then fine-tuning on a downstream task. Both approaches offer limited…

Machine Learning · Computer Science 2022-05-26 Andrea Gesmundo , Jeff Dean

State-of-the-art parameter-efficient fine-tuning methods rely on introducing adapter modules between the layers of a pretrained language model. However, such modules are trained separately for each task and thus do not enable sharing…

Computation and Language · Computer Science 2021-06-09 Rabeeh Karimi Mahabadi , Sebastian Ruder , Mostafa Dehghani , James Henderson

We introduce a novel method that enables parameter-efficient transfer and multi-task learning with deep neural networks. The basic approach is to learn a model patch - a small set of parameters - that will specialize to each task, instead…

Machine Learning · Computer Science 2019-02-26 Pramod Kaushik Mudrakarta , Mark Sandler , Andrey Zhmoginov , Andrew Howard

Finetuning a pretrained model has become a standard approach for training neural networks on novel tasks, resulting in fast convergence and improved performance. In this work, we study an alternative finetuning method, where instead of…

Machine Learning · Computer Science 2023-07-04 Gal Kaplun , Andrey Gurevich , Tal Swisa , Mazor David , Shai Shalev-Shwartz , Eran Malach

In recent years, deep neural networks have known a wide success in various application domains. However, they require important computational and memory resources, which severely hinders their deployment, notably on mobile devices or for…

Computer Vision and Pattern Recognition · Computer Science 2021-12-16 Nathan Hubens , Matei Mancas , Bernard Gosselin , Marius Preda , Titus Zaharia

Neural network pruning is a popular technique used to reduce the inference costs of modern, potentially overparameterized, networks. Starting from a pre-trained network, the process is as follows: remove redundant parameters, retrain, and…

Machine Learning · Computer Science 2021-03-05 Lucas Liebenwein , Cenk Baykal , Brandon Carter , David Gifford , Daniela Rus

Large, pre-trained models are problematic to use in resource constrained applications. Fortunately, task-aware structured pruning methods offer a solution. These approaches reduce model size by dropping structural units like layers and…

Computation and Language · Computer Science 2023-11-14 Lucio Dery , David Grangier , Awni Hannun

Foundation models and their checkpoints have significantly advanced deep learning, boosting performance across various applications. However, fine-tuned models often struggle outside their specific domains and exhibit considerable…

The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of pretrained models growing rapidly, how to perform parameter-efficient…

Computation and Language · Computer Science 2022-03-09 Zhengkun Zhang , Wenya Guo , Xiaojun Meng , Yasheng Wang , Yadao Wang , Xin Jiang , Qun Liu , Zhenglu Yang

Prompt tuning, in which a base pretrained model is adapted to each task via conditioning on learned prompt vectors, has emerged as a promising approach for efficiently adapting large language models to multiple downstream tasks. However,…

Computation and Language · Computer Science 2023-03-07 Zhen Wang , Rameswar Panda , Leonid Karlinsky , Rogerio Feris , Huan Sun , Yoon Kim

The pretrain-finetune paradigm usually improves downstream performance over training a model from scratch on the same task, becoming commonplace across many areas of machine learning. While pretraining is empirically observed to be…

Computer Vision and Pattern Recognition · Computer Science 2023-07-13 Gabriele Merlin , Vedant Nanda , Ruchit Rawal , Mariya Toneva

Fine-tuning is a popular way of exploiting knowledge contained in a pre-trained convolutional network for a new visual recognition task. However, the orthogonal setting of transferring knowledge from a pretrained network to a visually…

Computer Vision and Pattern Recognition · Computer Science 2020-08-28 Amelie Royer , Christoph H. Lampert

Recent works on parameter-efficient transfer learning (PETL) show the potential to adapt a pre-trained Vision Transformer to downstream recognition tasks with only a few learnable parameters. However, since they usually insert new…

Computer Vision and Pattern Recognition · Computer Science 2024-07-16 Taolin Zhang , Jiawang Bai , Zhihe Lu , Dongze Lian , Genping Wang , Xinchao Wang , Shu-Tao Xia

In this paper we present an alternative strategy for fine-tuning the parameters of a network. We named the technique Gradual Tuning. Once trained on a first task, the network is fine-tuned on a second task by modifying a progressively…

Artificial Intelligence · Computer Science 2017-11-29 Guglielmo Montone , J. Kevin O'Regan , Alexander V. Terekhov

Although multi-task deep neural network (DNN) models have computation and storage benefits over individual single-task DNN models, they can be further optimized via model compression. Numerous structured pruning methods are already…

Machine Learning · Computer Science 2023-04-17 Siddhant Garg , Lijun Zhang , Hui Guan
‹ Prev 1 2 3 10 Next ›