Related papers: StructBERT: Incorporating Language Structures into…

BERTSel: Answer Selection with Pre-trained Models

Recently, pre-trained models have been the dominant paradigm in natural language processing. They achieved remarkable state-of-the-art performance across a wide range of related tasks, such as textual entailment, natural language inference,…

Computation and Language · Computer Science 2019-05-21 Dongfang Li, Yifei Yu, Qingcai Chen, Xinyu Li

MathBERT: A Pre-Trained Model for Mathematical Formula Understanding

Large-scale pre-trained models like BERT have obtained great success in various Natural Language Processing (NLP) tasks, while it is still a challenge to adapt them to the math-related tasks. Current pre-trained models neglect the…

Computation and Language · Computer Science 2021-05-04 Shuai Peng, Ke Yuan, Liangcai Gao, Zhi Tang

Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks

Contextualized representations from a pre-trained language model are central to achieve a high performance on downstream NLP task. The pre-trained BERT and A Lite BERT (ALBERT) models can be fine-tuned to give state-of-the-art results in…

Computation and Language · Computer Science 2021-01-27 Hyunjin Choi, Judong Kim, Seongho Joe, Youngjune Gwon

ST-BERT: Cross-modal Language Model Pre-training For End-to-end Spoken Language Understanding

Language model pre-training has shown promising results in various downstream tasks. In this context, we introduce a cross-modal pre-trained language model, called Speech-Text BERT (ST-BERT), to tackle end-to-end spoken language…

Computation and Language · Computer Science 2021-04-13 Minjeong Kim, Gyuwan Kim, Sang-Woo Lee, Jung-Woo Ha

Unsupervised Pre-training with Structured Knowledge for Improving Natural Language Inference

While recent research on natural language inference has considerably benefited from large annotated datasets, the amount of inference-related knowledge (including commonsense) provided in the annotated data is still rather limited. There…

Computation and Language · Computer Science 2021-09-10 Xiaoyu Yang, Xiaodan Zhu, Zhan Shi, Tianda Li

CERT: Contrastive Self-supervised Learning for Language Understanding

Pretrained language models such as BERT, GPT have shown great effectiveness in language understanding. The auxiliary predictive tasks in existing pretraining approaches are mostly defined on tokens, thus may not be able to capture…

Computation and Language · Computer Science 2020-06-19 Hongchao Fang, Sicheng Wang, Meng Zhou, Jiayuan Ding, Pengtao Xie

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional…

Computation and Language · Computer Science 2019-05-28 Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

AMBERT: A Pre-trained Language Model with Multi-Grained Tokenization

Pre-trained language models such as BERT have exhibited remarkable performances in many tasks in natural language understanding (NLU). The tokens in the models are usually fine-grained in the sense that for languages like English they are…

Computation and Language · Computer Science 2021-05-28 Xinsong Zhang, Pengshuai Li, Hang Li

Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks

Pretraining sentence encoders with language modeling and related unsupervised tasks has recently been shown to be very effective for language understanding tasks. By supplementing language model-style pretraining with further training on…

Computation and Language · Computer Science 2019-03-01 Jason Phang, Thibault Févry, Samuel R. Bowman

A New Sentence Ordering Method Using BERT Pretrained Model

Building systems with capability of natural language understanding (NLU) has been one of the oldest areas of AI. An essential component of NLU is to detect logical succession of events contained in a text. The task of sentence ordering is…

Computation and Language · Computer Science 2021-08-30 Melika Golestani, Seyedeh Zahra Razavi, Heshaam Faili

RobBERT: a Dutch RoBERTa-based Language Model

Pre-trained language models have been dominating the field of natural language processing in recent years, and have led to significant performance gains for various complex natural language tasks. One of the most prominent pre-trained…

Computation and Language · Computer Science 2020-09-17 Pieter Delobelle, Thomas Winters, Bettina Berendt

FlauBERT: Unsupervised Language Model Pre-training for French

Language models have become a key step to achieve state-of-the-art results in many different Natural Language Processing (NLP) tasks. Leveraging the huge amount of unlabeled texts nowadays available, they provide an efficient way to…

Computation and Language · Computer Science 2020-03-16 Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne, Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, Didier Schwab

MedicalBERT: enhancing biomedical natural language processing using pretrained BERT-based model

Recent advances in natural language processing (NLP) have been driven by pretrained language models like BERT, RoBERTa, T5, and GPT. These models excel at understanding complex texts, but biomedical literature, with its domain-specific…

Computation and Language · Computer Science 2025-07-28 K. Sahit Reddy, N. Ragavenderan, Vasanth K., Ganesh N. Naik, Vishalakshi Prabhu, Nagaraja G. S

Efficient Fine-Tuning of Compressed Language Models with Learners

Fine-tuning BERT-based models is resource-intensive in memory, computation, and time. While many prior works aim to improve inference efficiency via compression techniques, e.g., pruning, these works do not explicitly address the…

Computation and Language · Computer Science 2022-08-04 Danilo Vucetic, Mohammadreza Tayaranian, Maryam Ziaeefard, James J. Clark, Brett H. Meyer, Warren J. Gross

Visualizing and Understanding the Effectiveness of BERT

Language model pre-training, such as BERT, has achieved remarkable results in many NLP tasks. However, it is unclear why the pre-training-then-fine-tuning paradigm can improve performance and generalization capability across different…

Computation and Language · Computer Science 2019-08-16 Yaru Hao, Li Dong, Furu Wei, Ke Xu

Syntax-BERT: Improving Pre-trained Transformers with Syntax Trees

Pre-trained language models like BERT achieve superior performances in various NLP tasks without explicit consideration of syntactic information. Meanwhile, syntactic information has been proved to be crucial for the success of NLP…

Computation and Language · Computer Science 2021-03-09 Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong

Structured Pruning of a BERT-based Question Answering Model

The recent trend in industry-setting Natural Language Processing (NLP) research has been to operate large-scale pretrained language models like BERT under strict computational limits. While most model compression work has focused on…

Computation and Language · Computer Science 2021-04-13 J. S. McCarley, Rishav Chakravarti, Avirup Sil

ESIE-BERT: Enriching Sub-words Information Explicitly with BERT for Joint Intent Classification and Slot Filling

Natural language understanding (NLU) has two core tasks: intent classification and slot filling. The success of pre-training language models resulted in a significant breakthrough in the two tasks. One of the promising solutions called BERT…

Computation and Language · Computer Science 2023-02-03 Yu Guo, Zhilong Xie, Xingyan Chen, Huangen Chen, Leilei Wang, Huaming Du, Shaopeng Wei, Yu Zhao, Qing Li, Gang Wu

A Comprehensive Comparison of Pre-training Language Models

Recently, the development of pre-trained language models has brought natural language processing (NLP) tasks to the new state-of-the-art. In this paper we explore the efficiency of various pre-trained language models. We pre-train a list of…

Computation and Language · Computer Science 2023-07-27 Tong Guo

SciBERT: A Pretrained Language Model for Scientific Text

Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin et al., 2018) to address the lack of high-quality, large-scale…

Computation and Language · Computer Science 2019-09-12 Iz Beltagy, Kyle Lo, Arman Cohan