Related papers: Towards Building Efficient Sentence BERT Models us…

On Importance of Layer Pruning for Smaller BERT Models and Low Resource Languages

This study explores the effectiveness of layer pruning for developing more efficient BERT models tailored to specific downstream tasks in low-resource languages. Our primary objective is to evaluate whether pruned BERT models can maintain…

Computation and Language · Computer Science 2025-01-03 Mayur Shirke, Amey Shembade, Madhushri Wagh, Pavan Thorat, Raviraj Joshi

Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks

Contextualized representations from a pre-trained language model are central to achieve a high performance on downstream NLP task. The pre-trained BERT and A Lite BERT (ALBERT) models can be fine-tuned to give state-of-the-art results in…

Computation and Language · Computer Science 2021-01-27 Hyunjin Choi, Judong Kim, Seongho Joe, Youngjune Gwon

Towards Robust Pruning: An Adaptive Knowledge-Retention Pruning Strategy for Language Models

The pruning objective has recently extended beyond accuracy and sparsity to robustness in language models. Despite this, existing methods struggle to enhance robustness against adversarial attacks when continually increasing model sparsity…

Computation and Language · Computer Science 2024-01-12 Jianwei Li, Qi Lei, Wei Cheng, Dongkuan Xu

Structured Pruning of Large Language Models

Large language models have recently achieved state of the art performance across a wide variety of natural language tasks. Meanwhile, the size of these models and their latency have significantly increased, which makes their usage costly,…

Computation and Language · Computer Science 2021-03-30 Ziheng Wang, Jeremy Wohlwend, Tao Lei

SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models

Sentence embedding is an important research topic in natural language processing (NLP) since it can transfer knowledge to downstream tasks. Meanwhile, a contextualized word representation, called BERT, achieves the state-of-the-art…

Computation and Language · Computer Science 2020-06-02 Bin Wang, C.-C. Jay Kuo

Extracting Sentence Embeddings from Pretrained Transformer Models

Pre-trained transformer models shine in many natural language processing tasks and therefore are expected to bear the representation of the input sentence or text meaning. These sentence-level embeddings are also important in…

Computation and Language · Computer Science 2025-02-21 Lukas Stankevičius, Mantas Lukoševičius

On the Prunability of Attention Heads in Multilingual BERT

Large multilingual models, such as mBERT, have shown promise in crosslingual transfer. In this work, we employ pruning to quantify the robustness and interpret layer-wise importance of mBERT. On four GLUE tasks, the relative drops in…

Computation and Language · Computer Science 2021-09-28 Aakriti Budhraja, Madhura Pande, Pratyush Kumar, Mitesh M. Khapra

BERMo: What can BERT learn from ELMo?

We propose BERMo, an architectural modification to BERT, which makes predictions based on a hierarchy of surface, syntactic and semantic language features. We use linear combination scheme proposed in Embeddings from Language Models (ELMo)…

Computation and Language · Computer Science 2021-11-01 Sangamesh Kodge, Kaushik Roy

Structured Pruning of a BERT-based Question Answering Model

The recent trend in industry-setting Natural Language Processing (NLP) research has been to operate large-scale pretrained language models like BERT under strict computational limits. While most model compression work has focused on…

Computation and Language · Computer Science 2021-04-13 J. S. McCarley, Rishav Chakravarti, Avirup Sil

Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding

Self-supervised speech representation learning (SSL) has shown to be effective in various downstream tasks, but SSL models are usually large and slow. Model compression techniques such as pruning aim to reduce the model size and computation…

Computation and Language · Computer Science 2023-03-01 Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe

Compressing Sentence Representation with maximum Coding Rate Reduction

In most natural language inference problems, sentence representation is needed for semantic retrieval tasks. In recent years, pre-trained large language models have been quite effective for computing such representations. These models…

Computation and Language · Computer Science 2023-04-26 Domagoj Ševerdija, Tomislav Prusina, Antonio Jovanović, Luka Borozan, Jurica Maltar, Domagoj Matijević

EELBERT: Tiny Models through Dynamic Embeddings

We introduce EELBERT, an approach for compression of transformer-based models (e.g., BERT), with minimal impact on the accuracy of downstream tasks. This is achieved by replacing the input embedding layer of the model with dynamic, i.e.…

Computation and Language · Computer Science 2023-11-01 Gabrielle Cohn, Rishika Agarwal, Deepanshu Gupta, Siddharth Patwardhan

The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models

Transformer-based language models have become a key building block for natural language processing. While these models are extremely accurate, they can be too large and computationally intensive to run on standard deployments. A variety of…

Computation and Language · Computer Science 2022-10-19 Eldar Kurtic, Daniel Campos, Tuan Nguyen, Elias Frantar, Mark Kurtz, Benjamin Fineran, Michael Goin, Dan Alistarh

On the Sentence Embeddings from Pre-trained Language Models

Pre-trained contextual representations like BERT have achieved great success in natural language processing. However, the sentence embeddings from the pre-trained language models without fine-tuning have been found to poorly capture…

Computation and Language · Computer Science 2020-11-12 Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer…

Computation and Language · Computer Science 2020-02-11 Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut

Rethinking Network Pruning -- under the Pre-train and Fine-tune Paradigm

Transformer-based pre-trained language models have significantly improved the performance of various natural language processing (NLP) tasks in the recent years. While effective and prevalent, these models are usually prohibitively large…

Computation and Language · Computer Science 2022-01-19 Dongkuan Xu, Ian E. H. Yen, Jinxi Zhao, Zhibin Xiao

Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning

Pre-trained universal feature extractors, such as BERT for natural language processing and VGG for computer vision, have become effective methods for improving deep learning models without requiring more labeled data. While effective,…

Computation and Language · Computer Science 2020-05-18 Mitchell A. Gordon, Kevin Duh, Nicholas Andrews

Efficient Fine-Tuning of Compressed Language Models with Learners

Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many prior works aim to improve inference efficiency via compression techniques, e.g., pruning, these works do not explicitly address the…

Computation and Language · Computer Science 2022-08-04 Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

Extremely Small BERT Models from Mixed-Vocabulary Training

Pretrained language models like BERT have achieved good results on NLP tasks, but are impractical on resource-limited devices due to memory footprint. A large fraction of this footprint comes from the input embeddings with large input…

Computation and Language · Computer Science 2021-02-09 Sanqiang Zhao, Raghav Gupta, Yang Song, Denny Zhou

Query Embedding Pruning for Dense Retrieval

Recent advances in dense retrieval techniques have offered the promise of being able not just to re-rank documents using contextualised language models such as BERT, but also to use such models to identify documents from the collection in…

Information Retrieval · Computer Science 2021-08-25 Nicola Tonellotto, Craig Macdonald