Related papers: Adapter Pruning using Tropical Characterization

200 papers

State-of-the-art pretrained NLP models contain from a hundred million to a trillion parameters. Adapters provide a parameter-efficient alternative to full finetuning, in which we finetune only lightweight neural network layers on top of…

Computation and Language · Computer Science 2022-05-04 Nafise Sadat Moosavi, Quentin Delfosse, Kristian Kersting, Iryna Gurevych

Multilingual machine translation suffers from negative interference across languages. A common solution is to relax parameter sharing with language-specific modules like adapters. However, adapters of related languages are unable to…

Computation and Language · Computer Science 2022-12-06 Christos Baziotis, Mikel Artetxe, James Cross, Shruti Bhosale

Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a…

Computation and Language · Computer Science 2021-06-08 Ruidan He, Linlin Liu, Hai Ye, Qingyu Tan, Bosheng Ding, Liying Cheng, Jia-Wei Low, Lidong Bing, Luo Si

Adapter Tuning, which freezes the pretrained language models (PLMs) and only fine-tunes a few extra modules, becomes an appealing efficient alternative to the full model fine-tuning. Although computationally efficient, the recent Adapters…

Computation and Language · Computer Science 2022-11-11 Shwai He, Liang Ding, Daize Dong, Miao Zhang, Dacheng Tao

Adapter modules were recently introduced as an efficient alternative to fine-tuning in NLP. Adapter tuning consists in freezing pretrained parameters of a model and injecting lightweight modules between layers, resulting in the addition of…

Computation and Language · Computer Science 2021-07-14 Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier

NLP (natural language processing) has achieved great success through the transformer model. However, the model has hundreds of millions or billions of parameters, which is a huge burden for its deployment on personal computers or small scale of…

Information Retrieval · Computer Science 2024-08-26 TianChen Wang

Adapters have been widely explored to alleviate computational and storage costs when fine-tuning pretrained foundation models. However, the adapter itself can exhibit redundancy, leading to unnecessary storage overhead and inferior…

Computer Vision and Pattern Recognition · Computer Science 2024-10-01 Yibo Zhong, Yao Zhou

Transformer models have revolutionized natural language processing with their unparalleled ability to grasp complex contextual relationships. However, the vast number of parameters in these models has raised concerns regarding computational…

Machine Learning · Computer Science 2023-10-10 Sia Gholami, Marwan Omar

Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we…

Adapting pre-trained neural models to downstream tasks has become the standard practice for obtaining high-quality models. In this work, we propose a novel model adaptation paradigm, adapting by pruning, which prunes neural connections in…

Machine Learning · Computer Science 2021-05-10 Yang Gao, Nicolo Colombo, Wei Wang

Transformer-based pre-trained models with millions of parameters require large storage. Recent approaches tackle this shortcoming by training adapters, but these approaches still require a relatively large number of parameters. In this…

Computation and Language · Computer Science 2023-01-31 Chin-Lun Fu, Zih-Ching Chen, Yun-Ru Lee, Hung-yi Lee

The outstanding performance and growing size of Large Language Models have led to increased attention to parameter-efficient learning. The two predominant approaches are Adapters and Pruning. Adapters freeze the model and give it a…

Computation and Language · Computer Science 2023-04-07 Guorun Wang, Jun Yang, Yaoru Sun

Transfer learning with large pretrained transformer-based language models like BERT has become a dominating approach for most NLP tasks. Simply fine-tuning those large language models on downstream tasks or combining it with task-specific…

Computation and Language · Computer Science 2021-08-06 Wenjuan Han, Bo Pang, Yingnian Wu

In this paper, we propose an adaptive pruning method. This method can cut off the channel and layer adaptively. The proportion of the layer and the channel to be cut is learned adaptively. The pruning method proposed in this paper can…

Machine Learning · Computer Science 2019-10-29 Weiwei Zhang, Changsheng Chen, Xuechun Wu, Jialin Gao, Di Bao, Jiwei Li, Xi Zhou

Fine-tuning of self-supervised models is a powerful transfer learning method in a variety of fields, including speech processing, since it can utilize generic feature representations obtained from large amounts of unlabeled data.…

Multimedia · Computer Science 2022-12-07 Shinta Otake, Rei Kawakami, Nakamasa Inoue

Parameter-efficient transfer learning with Adapters has been studied in Natural Language Processing (NLP) as an alternative to full fine-tuning. Adapters are memory-efficient and scale well with downstream tasks by training small…

Information Retrieval · Computer Science 2023-03-24 Vaishali Pal, Carlos Lassance, Hervé Déjean, Stéphane Clinchant

Adapters have been positioned as a parameter-efficient fine-tuning (PEFT) approach, whereby a minimal number of parameters are added to the model and fine-tuned. However, adapters have not been sufficiently analyzed to understand if PEFT…

Computation and Language · Computer Science 2023-05-15 Nandini Mundra, Sumanth Doddapaneni, Raj Dabre, Anoop Kunchukuttan, Ratish Puduppully, Mitesh M. Khapra

This paper proposes a method to effectively perform joint training-and-pruning based on adaptive dropout layers with unit-wise retention probabilities. The proposed method is based on the estimation of a unit-wise retention probability in a…

Computation and Language · Computer Science 2024-12-09 Yotaro Kubo, Xingyu Cai, Michiel Bacchiani

This work studies the long-standing problems of model capacity and negative interference in multilingual neural machine translation (MNMT). We use network pruning techniques and observe that pruning 50-70% of the parameters from a trained…

Computation and Language · Computer Science 2021-07-21 Zeeshan Khan, Kartheek Akella, Vinay P. Namboodiri, C V Jawahar

To solve ever more complex problems, Deep Neural Networks are scaled to billions of parameters, leading to huge computational costs. An effective approach to reduce computational requirements and increase efficiency is to prune unnecessary…
