Related papers: VisBERT: Hidden-State Visualizations for Transform…

How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations

Bidirectional Encoder Representations from Transformers (BERT) reach state-of-the-art results in a variety of Natural Language Processing tasks. However, understanding of their internal functioning is still insufficient and unsatisfactory.…

Computation and Language · Computer Science 2019-09-12 Betty van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers

exBERT: A Visual Analysis Tool to Explore Learned Representations in Transformers Models

Large language models can produce powerful contextual representations that lead to improvements across many NLP tasks. Since these models are typically guided by a sequence of learned self attention mechanisms and may comprise undesired…

Computation and Language · Computer Science 2019-10-14 Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional…

Computation and Language · Computer Science 2019-05-28 Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova

VL-InterpreT: An Interactive Visualization Tool for Interpreting Vision-Language Transformers

Breakthroughs in transformer-based models have revolutionized not only the NLP field, but also vision and multimodal systems. However, although visualization and interpretability tools have become available for NLP models, internal…

Computer Vision and Pattern Recognition · Computer Science 2022-08-24 Estelle Aflalo, Meng Du, Shao-Yen Tseng, Yongfei Liu, Chenfei Wu, Nan Duan, Vasudev Lal

BrainBERT: Self-supervised representation learning for intracranial recordings

We create a reusable Transformer, BrainBERT, for intracranial recordings bringing modern representation learning approaches to neuroscience. Much like in NLP and speech recognition, this Transformer enables classifying complex concepts,…

Machine Learning · Computer Science 2023-03-01 Christopher Wang, Vighnesh Subramaniam, Adam Uri Yaari, Gabriel Kreiman, Boris Katz, Ignacio Cases, Andrei Barbu

HUBERT Untangles BERT to Improve Transfer across NLP Tasks

We introduce HUBERT which combines the structured-representational power of Tensor-Product Representations (TPRs) and BERT, a pre-trained bidirectional Transformer language model. We show that there is shared structure between different NLP…

Computation and Language · Computer Science 2021-04-27 Mehrad Moradshahi, Hamid Palangi, Monica S. Lam, Paul Smolensky, Jianfeng Gao

BERT's output layer recognizes all hidden layers? Some Intriguing Phenomena and a simple way to boost BERT

Although Bidirectional Encoder Representations from Transformers (BERT) have achieved tremendous success in many natural language processing (NLP) tasks, it remains a black box. A variety of previous works have tried to lift the veil of…

Computation and Language · Computer Science 2021-02-16 Wei-Tsung Kao, Tsung-Han Wu, Po-Han Chi, Chun-Cheng Hsieh, Hung-Yi Lee

iBERT: Interpretable Embeddings via Sense Decomposition

We present iBERT (interpretable-BERT), an encoder to produce inherently interpretable and controllable embeddings - designed to modularize and expose the discriminative cues present in language, such as semantic or stylistic structure. Each…

Computation and Language · Computer Science 2026-01-27 Vishal Anand, Milad Alshomary, Kathleen McKeown

Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges

Recent years have witnessed a substantial increase in the use of deep learning to solve various natural language processing (NLP) problems. Early deep learning models were constrained by their sequential or unidirectional nature, such that…

Information Retrieval · Computer Science 2024-03-05 Jiajia Wang, Jimmy X. Huang, Xinhui Tu, Junmei Wang, Angela J. Huang, Md Tahmid Rahman Laskar, Amran Bhuiyan

TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding

Bidirectional Encoder Representations from Transformers (BERT) has recently achieved state-of-the-art performance on a broad range of NLP tasks including sentence classification, machine translation, and question answering. The BERT model…

Computation and Language · Computer Science 2020-03-17 Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang

Stacked DeBERT: All Attention in Incomplete Data for Text Classification

In this paper, we propose Stacked DeBERT, short for Stacked Denoising Bidirectional Encoder Representations from Transformers. This novel model improves robustness in incomplete data, when compared to existing systems, by designing a novel…

Computation and Language · Computer Science 2021-01-15 Gwenaelle Cunha Sergio, Minho Lee

Transformer visualization via dictionary learning: contextualized embedding as a linear superposition of transformer factors

Transformer networks have revolutionized NLP representation learning since they were introduced. Though a great effort has been made to explain the representation in transformers, it is widely recognized that our understanding is not…

Computation and Language · Computer Science 2023-04-05 Zeyu Yun, Yubei Chen, Bruno A Olshausen, Yann LeCun

ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks

We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-agnostic joint representations of image content and natural language. We extend the popular BERT architecture to a multi-modal two-stream model, processing…

Computer Vision and Pattern Recognition · Computer Science 2019-08-07 Jiasen Lu, Dhruv Batra, Devi Parikh, Stefan Lee

Which Features are Learned by CodeBert: An Empirical Study of the BERT-based Source Code Representation Learning

Bidirectional Encoder Representations from Transformers (BERT) was proposed for natural language processing (NLP) and shows promising results. Recently researchers applied BERT to source-code representation learning and reported…

Computation and Language · Computer Science 2023-08-14 Lan Zhang, Chen Cao, Zhilong Wang, Peng Liu

Text Summarization with Pretrained Encoders

Bidirectional Encoder Representations from Transformers (BERT) represents the latest incarnation of pretrained language models which have recently advanced a wide range of natural language processing tasks. In this paper, we showcase how…

Computation and Language · Computer Science 2019-09-06 Yang Liu, Mirella Lapata

Hierarchical Transformers for Long Document Classification

BERT, which stands for Bidirectional Encoder Representations from Transformers, is a recently introduced language representation model based upon the transfer learning paradigm. We extend its fine-tuning procedure to address one of its…

Computation and Language · Computer Science 2019-10-25 Raghavendra Pappagari, Piotr Żelasko, Jesús Villalba, Yishay Carmiel, Najim Dehak

What the [MASK]? Making Sense of Language-Specific BERT Models

Recently, Natural Language Processing (NLP) has witnessed an impressive progress in many areas, due to the advent of novel, pretrained contextual representation models. In particular, Devlin et al. (2019) proposed a model, called BERT…

Computation and Language · Computer Science 2020-03-09 Debora Nozza, Federico Bianchi, Dirk Hovy

VL-BERT: Pre-training of Generic Visual-Linguistic Representations

We introduce a new pre-trainable generic representation for visual-linguistic tasks, called Visual-Linguistic BERT (VL-BERT for short). VL-BERT adopts the simple yet powerful Transformer model as the backbone, and extends it to take both…

Computer Vision and Pattern Recognition · Computer Science 2020-02-19 Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai

Transformer-based approaches to Sentiment Detection

The use of transfer learning methods is largely responsible for the present breakthrough in Natural Language Processing (NLP) tasks across multiple domains. In order to solve the problem of sentiment detection, we examined the performance…

Computation and Language · Computer Science 2023-07-05 Olumide Ebenezer Ojo, Hoang Thang Ta, Alexander Gelbukh, Hiram Calvo, Olaronke Oluwayemisi Adebanji, Grigori Sidorov

BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation

The success of bidirectional encoders using masked language models, such as BERT, on numerous natural language processing tasks has prompted researchers to attempt to incorporate these pre-trained models into neural machine translation…

Computation and Language · Computer Science 2021-09-13 Haoran Xu, Benjamin Van Durme, Kenton Murray