Related papers: Compressing Large-Scale Transformer-Based Models: …

Exploring Extreme Parameter Compression for Pre-trained Language Models

Recent work explored the potential of large-scale Transformer-based pre-trained models, especially Pre-trained Language Models (PLMs) in natural language processing. This raises many concerns from various perspectives, e.g., financial costs…

Computation and Language · Computer Science 2022-05-23 Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu

A Primer in BERTology: What we know about how BERT works

Transformer-based models have pushed state of the art in many areas of NLP, but our understanding of what is behind their success is still limited. This paper is the first survey of over 150 studies of the popular BERT model. We review the…

Computation and Language · Computer Science 2020-11-10 Anna Rogers, Olga Kovaleva, Anna Rumshisky

Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey

Large, pre-trained transformer-based language models such as BERT have drastically changed the Natural Language Processing (NLP) field. We present a survey of recent work that uses these large language models to solve NLP tasks via…

Computation and Language · Computer Science 2021-11-03 Bonan Min, Hayley Ross, Elior Sulem, Amir Pouran Ben Veyseh, Thien Huu Nguyen, Oscar Sainz, Eneko Agirre, Ilana Heinz, Dan Roth

NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search

While pre-trained language models (e.g., BERT) have achieved impressive results on different natural language processing tasks, they have large numbers of parameters and suffer from big computational and memory costs, which make them…

Computation and Language · Computer Science 2021-06-01 Jin Xu, Xu Tan, Renqian Luo, Kaitao Song, Jian Li, Tao Qin, Tie-Yan Liu

Q8BERT: Quantized 8Bit BERT

Recently, pre-trained Transformer based language models such as BERT and GPT, have shown great improvement in many Natural Language Processing (NLP) tasks. However, these models contain a large amount of parameters. The emergence of even…

Computation and Language · Computer Science 2021-12-20 Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat

Compression of Deep Learning Models for Text: A Survey

In recent years, the fields of natural language processing (NLP) and information retrieval (IR) have made tremendous progress thanks to deep learning models like Recurrent Neural Networks (RNNs), Gated Recurrent Units (GRUs) and Long…

Computation and Language · Computer Science 2021-06-15 Manish Gupta, Puneet Agrawal

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer…

Computation and Language · Computer Science 2020-02-11 Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut

A Survey on Model Compression and Acceleration for Pretrained Language Models

Despite achieving state-of-the-art performance on many NLP tasks, the high energy cost and long inference delay prevent Transformer-based pretrained language models (PLMs) from seeing broader adoption including for edge and mobile…

Computation and Language · Computer Science 2022-11-30 Canwen Xu, Julian McAuley

Prune Once for All: Sparse Pre-Trained Language Models

Transformer-based language models are applied to a wide range of applications in natural language processing. However, they are inefficient and difficult to deploy. In recent years, many compression algorithms have been proposed to increase…

Computation and Language · Computer Science 2021-11-11 Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting,…

Computation and Language · Computer Science 2020-06-24 Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph E. Gonzalez

Advancements in Natural Language Processing: Exploring Transformer-Based Architectures for Text Understanding

Natural Language Processing (NLP) has witnessed a transformative leap with the advent of transformer-based architectures, which have significantly enhanced the ability of machines to understand and generate human-like text. This paper…

Computation and Language · Computer Science 2025-03-27 Tianhao Wu, Yu Wang, Ngoc Quach

A Survey on Model Compression for Large Language Models

Large Language Models (LLMs) have transformed natural language processing tasks successfully. Yet, their large size and high computational needs pose challenges for practical use, especially in resource-limited settings. Model compression…

Computation and Language · Computer Science 2024-07-31 Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang

A Comprehensive Comparison of Pre-training Language Models

Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to the new state-of-the-art. In this paper we explore the efficiency of various pre-trained language models. We pre-train a list of…

Computation and Language · Computer Science 2023-07-27 Tong Guo

Structured Pruning of a BERT-based Question Answering Model

The recent trend in industry-setting Natural Language Processing (NLP) research has been to operate large-scale pretrained language models like BERT under strict computational limits. While most model compression work has focused on…

Computation and Language · Computer Science 2021-04-13 J. S. McCarley, Rishav Chakravarti, Avirup Sil

Robust Transfer Learning with Pretrained Language Models through Adapters

Transfer learning with large pretrained transformer-based language models like BERT has become a dominating approach for most NLP tasks. Simply fine-tuning those large language models on downstream tasks or combining it with task-specific…

Computation and Language · Computer Science 2021-08-06 Wenjuan Han, Bo Pang, Yingnian Wu

The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures

In recent years, Natural Language Processing (NLP) models have achieved phenomenal success in linguistic and semantic tasks like text classification, machine translation, cognitive dialogue systems, information retrieval via Natural…

Computation and Language · Computer Science 2021-05-18 Sushant Singh, Ausif Mahmood

Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT

Large pre-trained language models have recently gained significant traction due to their improved performance on various down-stream tasks like text classification and question answering, requiring only few epochs of fine-tuning. However,…

Computation and Language · Computer Science 2023-09-01 Souvik Kundu, Sharath Nittur Sridhar, Maciej Szankin, Sairam Sundaresan

A Short Study on Compressing Decoder-Based Language Models

Pre-trained Language Models (PLMs) have been successful for a wide range of natural language processing (NLP) tasks. The state-of-the-art of PLMs, however, are extremely large to be used on edge devices. As a result, the topic of model…

Computation and Language · Computer Science 2021-10-19 Tianda Li, Yassir El Mesbahi, Ivan Kobyzev, Ahmad Rashid, Atif Mahmud, Nithin Anchuri, Habib Hajimolahoseini, Yang Liu, Mehdi Rezagholizadeh

The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models

Transformer-based language models have become a key building block for natural language processing. While these models are extremely accurate, they can be too large and computationally intensive to run on standard deployments. A variety of…

Computation and Language · Computer Science 2022-10-19 Eldar Kurtic, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, Dan Alistarh

A Survey on Transformer Compression

Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV), specially for constructing large language models (LLM) and large vision models (LVM). Model compression methods reduce the memory…

Machine Learning · Computer Science 2024-04-09 Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, Dacheng Tao