Related papers: Efficient Attribute Injection for Pretrained Langu…

On the Effectiveness of Adapter-based Tuning for Pretrained Language Model Adaptation

Adapter-based tuning has recently arisen as an alternative to fine-tuning. It works by adding light-weight adapter modules to a pretrained language model (PrLM) and only updating the parameters of adapter modules when learning on a…

Computation and Language · Computer Science 2021-06-08 Ruidan He , Linlin Liu , Hai Ye , Qingyu Tan , Bosheng Ding , Liying Cheng , Jia-Wei Low , Lidong Bing , Luo Si

Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models

Parameter-shared pre-trained language models (PLMs) have emerged as a successful approach in resource-constrained environments, enabling substantial reductions in model storage and memory costs without significant performance compromise.…

Computation and Language · Computer Science 2023-10-20 Weize Chen , Xiaoyue Xu , Xu Han , Yankai Lin , Ruobing Xie , Zhiyuan Liu , Maosong Sun , Jie Zhou

APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference

Fine-tuning and inference with large Language Models (LM) are generally known to be expensive. Parameter-efficient fine-tuning over pretrained LMs reduces training memory by updating a small number of LM parameters but does not improve…

Computation and Language · Computer Science 2024-06-05 Bowen Zhao , Hannaneh Hajishirzi , Qingqing Cao

Improving Language Plasticity via Pretraining with Active Forgetting

Pretrained language models (PLMs) are today the primary model for natural language processing. Despite their impressive downstream performance, it can be difficult to apply PLMs to new languages, a barrier to making their capabilities…

Computation and Language · Computer Science 2024-01-15 Yihong Chen , Kelly Marchisio , Roberta Raileanu , David Ifeoluwa Adelani , Pontus Stenetorp , Sebastian Riedel , Mikel Artetxe

Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations

Due to the huge amount of parameters, fine-tuning of pretrained language models (PLMs) is prone to overfitting in the low resource scenarios. In this work, we present a novel method that operates on the hidden representations of a PLM to…

Computation and Language · Computer Science 2023-05-29 Linlin Liu , Xingxuan Li , Megh Thakkar , Xin Li , Shafiq Joty , Luo Si , Lidong Bing

On the Domain Adaptation and Generalization of Pretrained Language Models: A Survey

Recent advances in NLP are brought by a range of large-scale pretrained language models (PLMs). These PLMs have brought significant performance gains for a range of NLP tasks, circumventing the need to customize complex designs for specific…

Computation and Language · Computer Science 2022-11-08 Xu Guo , Han Yu

Large Product Key Memory for Pretrained Language Models

Product key memory (PKM) proposed by Lample et al. (2019) enables to improve prediction accuracy by increasing model capacity efficiently with insignificant computational overhead. However, their empirical application is only limited to…

Computation and Language · Computer Science 2020-10-09 Gyuwan Kim , Tae-Hwan Jung

Incorporating Priors with Feature Attribution on Text Classification

Feature attribution methods, proposed recently, help users interpret the predictions of complex models. Our approach integrates feature attributions into the objective function to allow machine learning practitioners to incorporate priors…

Computation and Language · Computer Science 2019-06-21 Frederick Liu , Besim Avci

Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment

With the continuous growth in the number of parameters of transformer-based pretrained language models (PLMs), particularly the emergence of large language models (LLMs) with billions of parameters, many natural language processing (NLP)…

Computation and Language · Computer Science 2023-12-20 Lingling Xu , Haoran Xie , Si-Zhao Joe Qin , Xiaohui Tao , Fu Lee Wang

Systematic Analysis for Pretrained Language Model Priming for Parameter-Efficient Fine-tuning

Parameter-efficient (PE) methods (like Prompts or Adapters) for adapting pre-trained language models (PLM) to downstream tasks have been popular recently. However, hindrances still prevent these methods from reaching their full potential.…

Computation and Language · Computer Science 2024-05-31 Shih-Cheng Huang , Shih-Heng Wang , Min-Han Shih , Saurav Sahay , Hung-yi Lee

Parameter-Efficient Tuning by Manipulating Hidden States of Pretrained Language Models For Classification Tasks

Parameter-efficient tuning aims to distill knowledge for downstream tasks by optimizing a few introduced parameters while freezing the pretrained language models (PLMs). Continuous prompt tuning which prepends a few trainable vectors to the…

Computation and Language · Computer Science 2022-04-14 Haoran Yang , Piji Li , Wai Lam

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities. However, controlling attributes of the generated language (e.g. switching topic or sentiment) is difficult without…

Computation and Language · Computer Science 2020-03-04 Sumanth Dathathri , Andrea Madotto , Janice Lan , Jane Hung , Eric Frank , Piero Molino , Jason Yosinski , Rosanne Liu

Towards Efficient Active Learning in NLP via Pretrained Representations

Fine-tuning Large Language Models (LLMs) is now a common approach for text classification in a wide range of applications. When labeled documents are scarce, active learning helps save annotation efforts but requires retraining of massive…

Machine Learning · Computer Science 2024-02-27 Artem Vysogorets , Achintya Gopal

Mitigating Data Scarcity for Large Language Models

In recent years, pretrained neural language models (PNLMs) have taken the field of natural language processing by storm, achieving new benchmarks and state-of-the-art performances. These models often rely heavily on annotated data, which…

Computation and Language · Computer Science 2023-02-06 Hoang Van

Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation

Large pretrained language models (PLMs) are often domain- or task-adapted via fine-tuning or prompting. Finetuning requires modifying all of the parameters and having enough data to avoid overfitting while prompting requires no training and…

Computation and Language · Computer Science 2022-07-11 Zejiang Hou , Julian Salazar , George Polovets

Parameter Efficient Tuning Allows Scalable Personalization of LLMs for Text Entry: A Case Study on Abbreviation Expansion

Abbreviation expansion is a strategy used to speed up communication by limiting the amount of typing and using a language model to suggest expansions. Here we look at personalizing a Large Language Model's (LLM) suggestions based on prior…

Computation and Language · Computer Science 2023-12-25 Katrin Tomanek , Shanqing Cai , Subhashini Venugopalan

Enhancing E-Commerce Recommendation using Pre-Trained Language Model and Fine-Tuning

Pretrained Language Models (PLM) have been greatly successful on a board range of natural language processing (NLP) tasks. However, it has just started being applied to the domain of recommendation systems. Traditional recommendation…

Machine Learning · Computer Science 2023-02-10 Nuofan Xu , Chenhui Hu

Optimising Language Models for Downstream Tasks: A Post-Training Perspective

Language models (LMs) have demonstrated remarkable capabilities in NLP, yet adapting them efficiently and robustly to specific tasks remains challenging. As their scale and complexity grow, fine-tuning LMs on labelled data often…

Computation and Language · Computer Science 2025-06-27 Zhengyan Shi

Scattered or Connected? An Optimized Parameter-efficient Tuning Approach for Information Retrieval

Pre-training and fine-tuning have achieved significant advances in the information retrieval (IR). A typical approach is to fine-tune all the parameters of large-scale pre-trained models (PTMs) on downstream tasks. As the model size and the…

Information Retrieval · Computer Science 2022-08-23 Xinyu Ma , Jiafeng Guo , Ruqing Zhang , Yixing Fan , Xueqi Cheng

Multi-Attribute Multi-Grained Adaptation of Pre-Trained Language Models for Text Understanding from Bayesian Perspective

Current neural networks often employ multi-domain-learning or attribute-injecting mechanisms to incorporate non-independent and identically distributed (non-IID) information for text understanding tasks by capturing individual…

Computation and Language · Computer Science 2025-03-11 You Zhang , Jin Wang , Liang-Chih Yu , Dan Xu , Xuejie Zhang