Related papers: Finding patterns in Knowledge Attribution for Tran…

BERTnesia: Investigating the capture and forgetting of knowledge in BERT

Probing complex language models has recently revealed several insights into linguistic and semantic patterns found in the learned representations. In this paper, we probe BERT specifically to understand and measure the relational knowledge…

Computation and Language · Computer Science 2021-09-09 Jonas Wallat , Jaspreet Singh , Avishek Anand

Knowledge Neurons in Pretrained Transformers

Large-scale pretrained language models are surprisingly good at recalling factual knowledge presented in the training corpus. In this paper, we present preliminary studies on how factual knowledge is stored in pretrained Transformers by…

Computation and Language · Computer Science 2022-03-11 Damai Dai , Li Dong , Yaru Hao , Zhifang Sui , Baobao Chang , Furu Wei

Probing Pretrained Language Models for Lexical Semantics

The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture. While prior research focused on…

Computation and Language · Computer Science 2020-10-13 Ivan Vulić , Edoardo Maria Ponti , Robert Litschko , Goran Glavaš , Anna Korhonen

Tracing the Roots of Facts in Multilingual Language Models: Independent, Shared, and Transferred Knowledge

Acquiring factual knowledge for language models (LMs) in low-resource languages poses a serious challenge, thus resorting to cross-lingual transfer in multilingual LMs (ML-LMs). In this study, we ask how ML-LMs acquire and represent factual…

Computation and Language · Computer Science 2024-03-11 Xin Zhao , Naoki Yoshinaga , Daisuke Oba

How transfer learning impacts linguistic knowledge in deep NLP models?

Transfer learning from pre-trained neural language models towards downstream tasks has been a predominant theme in NLP recently. Several researchers have shown that deep NLP models learn non-trivial amount of linguistic knowledge, captured…

Computation and Language · Computer Science 2021-06-01 Nadir Durrani , Hassan Sajjad , Fahim Dalvi

Towards Generating Informative Textual Description for Neurons in Language Models

Recent developments in transformer-based language models have allowed them to capture a wide variety of world knowledge that can be adapted to downstream tasks with limited resources. However, what pieces of information are understood in…

Computation and Language · Computer Science 2024-01-31 Shrayani Mondal , Rishabh Garodia , Arbaaz Qureshi , Taesung Lee , Youngja Park

Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons

Pre-trained language models (PLMs) contain vast amounts of factual knowledge, but how the knowledge is stored in the parameters remains unclear. This paper delves into the complex task of understanding how factual knowledge is stored in…

Computation and Language · Computer Science 2023-12-21 Yuheng Chen , Pengfei Cao , Yubo Chen , Kang Liu , Jun Zhao

Analyzing Individual Neurons in Pre-trained Language Models

While a lot of analysis has been carried to demonstrate linguistic knowledge captured by the representations learned within deep NLP models, very little attention has been paid towards individual neurons. We carry out a neuron-level analysis…

Computation and Language · Computer Science 2020-10-07 Nadir Durrani , Hassan Sajjad , Fahim Dalvi , Yonatan Belinkov

Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers

Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work focused on injecting (structured) knowledge from external resources into these models. While on the…

Computation and Language · Computer Science 2020-10-13 Anne Lauscher , Olga Majewska , Leonardo F. R. Ribeiro , Iryna Gurevych , Nikolai Rozanov , Goran Glavaš

Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

We study the problem of incorporating prior knowledge into a deep Transformer-based model, i.e., Bidirectional Encoder Representations from Transformers (BERT), to enhance its performance on semantic textual matching tasks. By probing and…

Computation and Language · Computer Science 2021-02-23 Tingyu Xia , Yue Wang , Yuan Tian , Yi Chang

Linguistic Knowledge and Transferability of Contextual Representations

Contextual word representations derived from large-scale neural language models are successful across a diverse set of NLP tasks, suggesting that they encode useful and transferable features of language. To shed light on the linguistic…

Computation and Language · Computer Science 2019-04-29 Nelson F. Liu , Matt Gardner , Yonatan Belinkov , Matthew E. Peters , Noah A. Smith

Language Representation Projection: Can We Transfer Factual Knowledge across Languages in Multilingual Language Models?

Multilingual pretrained language models serve as repositories of multilingual factual knowledge. Nevertheless, a substantial performance gap of factual knowledge probing exists between high-resource languages and low-resource languages,…

Computation and Language · Computer Science 2023-11-08 Shaoyang Xu , Junzhuo Li , Deyi Xiong

What does BERT know about books, movies and music? Probing BERT for Conversational Recommendation

Heavily pre-trained transformer models such as BERT have recently shown to be remarkably powerful at language modelling by achieving impressive results on numerous downstream tasks. It has also been shown that they are able to implicitly…

Information Retrieval · Computer Science 2021-03-05 Gustavo Penha , Claudia Hauff

One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models

Large language models (LLMs) have learned vast amounts of factual knowledge through self-supervised pre-training on large-scale corpora. Meanwhile, LLMs have also demonstrated excellent multilingual capabilities, which can express the…

Computation and Language · Computer Science 2024-11-27 Pengfei Cao , Yuheng Chen , Zhuoran Jin , Yubo Chen , Kang Liu , Jun Zhao

Roof-Transformer: Divided and Joined Understanding with Knowledge Enhancement

Recent work on enhancing BERT-based language representation models with knowledge graphs (KGs) and knowledge bases (KBs) has yielded promising results on multiple NLP tasks. State-of-the-art approaches typically integrate the original input…

Computation and Language · Computer Science 2022-10-21 Wei-Lin Liao , Cheng-En Su , Wei-Yun Ma

Multi-Head Multi-Layer Attention to Deep Language Representations for Grammatical Error Detection

It is known that a deep neural network model pre-trained with large-scale data greatly improves the accuracy of various tasks, especially when there are resource constraints. However, the information needed to solve a given task can vary,…

Computation and Language · Computer Science 2019-04-17 Masahiro Kaneko , Mamoru Komachi

An Empirical Study on the Transferability of Transformer Modules in Parameter-Efficient Fine-Tuning

Parameter-efficient fine-tuning approaches have recently garnered a lot of attention. Having considerably lower number of trainable weights, these methods can bring about scalability and computational effectiveness. In this paper, we look…

Computation and Language · Computer Science 2023-02-23 Mohammad Akbar-Tajari , Sara Rajaee , Mohammad Taher Pilehvar

A Distributional Semantics Approach to Implicit Language Learning

In the present paper we show that distributional information is particularly important when considering concept availability under implicit language learning conditions. Based on results from different behavioural experiments we argue that…

Computation and Language · Computer Science 2016-06-30 Dimitrios Alikaniotis , John N. Williams

What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis

Deep neural networks are inherently opaque and challenging to interpret. Unlike hand-crafted feature-based models, we struggle to comprehend the concepts learned and how they interact within these models. This understanding is crucial not…

Computation and Language · Computer Science 2023-07-12 Shammur Absar Chowdhury , Nadir Durrani , Ahmed Ali