Related papers: DeeBERT: Dynamic Early Exiting for Accelerating BE…

200 papers

Large-scale pre-trained language models such as BERT have contributed significantly to the development of NLP. However, those models require large computational resources, making them difficult to apply to mobile devices where computing…

Computation and Language · Computer Science 2023-08-02 Weixin Wu, Hankz Hankui Zhuo

Despite their great success in the Natural Language Processing (NLP) area, large pre-trained language models like BERT are not well-suited for resource-constrained or real-time applications owing to the large number of parameters and slow…

Computation and Language · Computer Science 2021-07-02 Keli Xie, Siyuan Lu, Meiqi Wang, Zhongfeng Wang

BERT has achieved superior performances on Natural Language Understanding (NLU) tasks. However, BERT possesses a large number of parameters and demands certain resources to deploy. For acceleration, Dynamic Early Exiting for BERT (DeeBERT)…

Computation and Language · Computer Science 2021-01-26 Shijie Geng, Peng Gao, Zuohui Fu, Yongfeng Zhang

Pre-trained Language Models (PLMs), like BERT, with self-supervision objectives exhibit remarkable performance and generalization across various tasks. However, they suffer in inference latency due to their large size. To address this…

Computation and Language · Computer Science 2024-05-27 Divya Jyoti Bajpai, Manjesh Kumar Hanawal

In this paper, we propose Patience-based Early Exit, a straightforward yet effective inference method that can be used as a plug-and-play technique to simultaneously improve the efficiency and robustness of a pretrained language model…

Computation and Language · Computer Science 2020-10-23 Wangchunshu Zhou, Canwen Xu, Tao Ge, Julian McAuley, Ke Xu, Furu Wei

Dynamic early exiting has been proven to improve the inference speed of pre-trained language models like BERT. However, all samples must go through all consecutive layers before early exiting and more complex samples usually go through…

Computation and Language · Computer Science 2023-05-09 Boren Hu, Yun Zhu, Jiacheng Li, Siliang Tang

Transformer-based language models such as BERT provide significant accuracy improvement for a multitude of natural language processing (NLP) tasks. However, their hefty computational and memory demands make them challenging to deploy to…

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models in on-the-edge and/or under constrained computational training or inference budgets remains…

Computation and Language · Computer Science 2020-03-03 Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf

As NLP models become larger, executing a trained model requires significant computational resources incurring monetary and environmental costs. To better respect a given inference budget, we propose a modification to contextual…

Computation and Language · Computer Science 2020-05-12 Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, Noah A. Smith

Limited computational budgets often prevent transformers from being used in production and from having their high accuracy utilized. TinyBERT addresses the computational efficiency by self-distilling BERT into a smaller transformer…

Computation and Language · Computer Science 2021-11-19 Shira Guskin, Moshe Wasserblat, Ke Ding, Gyuwan Kim

Early exiting has demonstrated its effectiveness in accelerating the inference of pre-trained language models like BERT by dynamically adjusting the number of layers executed. However, most existing early exiting methods only consider local…

Machine Learning · Computer Science 2025-12-30 Jianing He, Qi Zhang, Weiping Ding, Duoqian Miao, Jun Zhao, Liang Hu, Longbing Cao

Large-scale pre-trained language models have shown remarkable results in diverse NLP applications. Unfortunately, these performance gains have been accompanied by a significant increase in computation time and model size, stressing the need…

Computation and Language · Computer Science 2021-09-27 Cristóbal Eyzaguirre, Felipe del Río, Vladimir Araujo, Álvaro Soto

With the development of deep learning and Transformer-based pre-trained models like BERT, the accuracy of many NLP tasks has been dramatically improved. However, the large number of parameters and computations also pose challenges for their…

Computation and Language · Computer Science 2022-12-07 Siyuan Lu, Chenchen Zhou, Keli Xie, Jun Lin, Zhongfeng Wang

Pre-trained language models like BERT have proven to be highly performant. However, they are often computationally expensive in many practical scenarios, for such heavy models can hardly be readily implemented with limited resources. To…

Computation and Language · Computer Science 2020-04-30 Weijie Liu, Peng Zhou, Zhe Zhao, Zhiruo Wang, Haotang Deng, Qi Ju

Pre-training with self-supervised models, such as Hidden-unit BERT (HuBERT) and wav2vec 2.0, has brought significant improvements in automatic speech recognition (ASR). However, these models usually require an expensive computational cost…

Computation and Language · Computer Science 2024-06-21 Ji Won Yoon, Beom Jun Woo, Nam Soo Kim

Dynamic early exiting aims to accelerate the inference of pre-trained language models (PLMs) by emitting predictions in internal layers without passing through the entire model. In this paper, we empirically analyze the working mechanism of…

Computation and Language · Computer Science 2021-09-06 Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun

Existing pre-trained language models (PLMs) are often computationally expensive in inference, making them impractical in various resource-limited real-world applications. To address this issue, we propose a dynamic token reduction approach…

Computation and Language · Computer Science 2021-05-26 Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun

Heavily overparameterized language models such as BERT, XLNet and T5 have achieved impressive success in many NLP tasks. However, their high model complexity requires enormous computation resources and extremely long training time for both…

Computation and Language · Computer Science 2021-06-09 Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Zhangyang Wang, Jingjing Liu

Language model pre-training, such as BERT, has significantly improved the performances of many natural language processing tasks. However, pre-trained language models are usually computationally expensive, so it is difficult to efficiently…

Computation and Language · Computer Science 2020-10-19 Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, Fang Wang, Qun Liu

Currently, pre-trained models can be considered the default choice for a wide range of NLP tasks. Despite their SoTA results, there is practical evidence that these models may require a different number of computing layers for different…

Machine Learning · Computer Science 2023-05-19 Nikita Balagansky, Daniil Gavrilov