Related papers: Compressing Transformer-Based Semantic Parsing Mod…

Exploring Extreme Parameter Compression for Pre-trained Language Models

Recent work explored the potential of large-scale Transformer-based pre-trained models, especially Pre-trained Language Models (PLMs) in natural language processing. This raises many concerns from various perspectives, e.g., financial costs…

Computation and Language · Computer Science 2022-05-23 Yuxin Ren, Benyou Wang, Lifeng Shang, Xin Jiang, Qun Liu

ROSITA: Refined BERT cOmpreSsion with InTegrAted techniques

Pre-trained language models of the BERT family have defined the state of the art in a wide range of NLP tasks. However, the performance of BERT-based models is mainly driven by the enormous amount of parameters, which hinders their…

Computation and Language · Computer Science 2021-03-23 Yuanxin Liu, Zheng Lin, Fengcheng Yuan

iBERT: Interpretable Embeddings via Sense Decomposition

We present iBERT (interpretable-BERT), an encoder to produce inherently interpretable and controllable embeddings - designed to modularize and expose the discriminative cues present in language, such as semantic or stylistic structure. Each…

Computation and Language · Computer Science 2026-01-27 Vishal Anand, Milad Alshomary, Kathleen McKeown

Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

Pre-trained Transformer-based models have achieved state-of-the-art performance for various Natural Language Processing (NLP) tasks. However, these models often have billions of parameters, and, thus, are too resource-hungry and…

Machine Learning · Computer Science 2021-09-29 Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yin Yang, Hassan Sajjad, Preslav Nakov, Deming Chen, Marianne Winslett

Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings

Models based on the transformer architecture, such as BERT, have marked a crucial step forward in the field of Natural Language Processing. Importantly, they allow the creation of word embeddings that capture important semantic information…

Computation and Language · Computer Science 2021-01-01 Jacob Turton, David Vinson, Robert Elliott Smith

Establishing Strong Baselines for the New Decade: Sequence Tagging, Syntactic and Semantic Parsing with BERT

This paper presents new state-of-the-art models for three tasks, part-of-speech tagging, syntactic parsing, and semantic parsing, using the cutting-edge contextualized embedding framework known as BERT. For each task, we first replicate and…

Computation and Language · Computer Science 2020-05-26 Han He, Jinho D. Choi

Extracting Sentence Embeddings from Pretrained Transformer Models

Pre-trained transformer models shine in many natural language processing tasks and therefore are expected to bear the representation of the input sentence or text meaning. These sentence-level embeddings are also important in…

Computation and Language · Computer Science 2025-02-21 Lukas Stankevičius, Mantas Lukoševičius

RefBERT: Compressing BERT by Referencing to Pre-computed Representations

Recently developed large pre-trained language models, e.g., BERT, have achieved remarkable performance in many downstream natural language processing applications. These pre-trained language models often contain hundreds of millions of…

Computation and Language · Computer Science 2021-06-17 Xinyi Wang, Haiqin Yang, Liang Zhao, Yang Mo, Jianping Shen

SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable Semantic Features

Models based on large-pretrained language models, such as S(entence)BERT, provide effective and efficient sentence embeddings that show high correlation to human similarity ratings, but lack interpretability. On the other hand, graph…

Computation and Language · Computer Science 2025-10-17 Juri Opitz, Anette Frank

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations

Increasing model size when pretraining natural language representations often results in improved performance on downstream tasks. However, at some point further model increases become harder due to GPU/TPU memory limitations and longer…

Computation and Language · Computer Science 2020-02-11 Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut

Semantics-aware BERT for Language Understanding

The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of successes, especially in various machine reading comprehension and natural language inference…

Computation and Language · Computer Science 2020-02-05 Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou

Transition-based Abstract Meaning Representation Parsing with Contextual Embeddings

The ability to understand and generate languages sets human cognition apart from other known life forms'. We study a way of combining two of the most successful routes to meaning of language--statistical language models and symbolic semantics…

Computation and Language · Computer Science 2022-06-14 Yichao Liang

NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search

While pre-trained language models (e.g., BERT) have achieved impressive results on different natural language processing tasks, they have large numbers of parameters and suffer from big computational and memory costs, which make them…

Computation and Language · Computer Science 2021-06-01 Jin Xu, Xu Tan, Renqian Luo, Kaitao Song, Jian Li, Tao Qin, Tie-Yan Liu

LadaBERT: Lightweight Adaptation of BERT through Hybrid Model Compression

BERT is a cutting-edge language representation model pre-trained by a large corpus, which achieves superior performances on various natural language understanding tasks. However, a major blocking issue of applying BERT to online services is…

Computation and Language · Computer Science 2020-10-22 Yihuan Mao, Yujing Wang, Chufan Wu, Chen Zhang, Yang Wang, Yaming Yang, Quanlu Zhang, Yunhai Tong, Jing Bai

Text Summarization with Pretrained Encoders

Bidirectional Encoder Representations from Transformers (BERT) represents the latest incarnation of pretrained language models which have recently advanced a wide range of natural language processing tasks. In this paper, we showcase how…

Computation and Language · Computer Science 2019-09-06 Yang Liu, Mirella Lapata

AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search

Large pre-trained language models such as BERT have shown their effectiveness in various natural language processing tasks. However, the huge parameter size makes them difficult to be deployed in real-time applications that require quick…

Computation and Language · Computer Science 2021-01-25 Daoyuan Chen, Yaliang Li, Minghui Qiu, Zhen Wang, Bofang Li, Bolin Ding, Hongbo Deng, Jun Huang, Wei Lin, Jingren Zhou

SEE: Sememe Entanglement Encoding for Transformer-bases Models Compression

Transformer-based large language models exhibit groundbreaking capabilities, but their storage and computational costs are prohibitively high, limiting their application in resource-constrained scenarios. An effective approach is to…

Machine Learning · Computer Science 2024-12-18 Jing Zhang, Shuzhen Sun, Peng Zhang, Guangxing Cao, Hui Gao, Xindian Ma, Nan Xu, Yuexian Hou

Domain Lexical Knowledge-based Word Embedding Learning for Text Classification under Small Data

Pre-trained language models such as BERT have been proved to be powerful in many natural language processing tasks. But in some text classification applications such as emotion recognition and sentiment analysis, BERT may not lead to…

Computation and Language · Computer Science 2025-06-03 Zixiao Zhu, Kezhi Mao

Extremely Small BERT Models from Mixed-Vocabulary Training

Pretrained language models like BERT have achieved good results on NLP tasks, but are impractical on resource-limited devices due to memory footprint. A large fraction of this footprint comes from the input embeddings with large input…

Computation and Language · Computer Science 2021-02-09 Sanqiang Zhao, Raghav Gupta, Yang Song, Denny Zhou

Learning and Evaluating Contextual Embedding of Source Code

Recent research has achieved impressive results on understanding and improving source code by building up on machine-learning techniques developed for natural languages. A significant advancement in natural-language understanding has come…

Software Engineering · Computer Science 2020-08-19 Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi