Related papers: Verb Knowledge Injection for Multilingual Event Pr…

Reverse Transfer Learning: Can Word Embeddings Trained for Different NLP Tasks Improve Neural Language Models?

Natural language processing (NLP) tasks tend to suffer from a paucity of suitably annotated training data, hence the recent success of transfer learning across a wide variety of them. The typical recipe involves: (i) training a deep,…

Computation and Language · Computer Science 2019-09-11 Lyan Verwimp, Jerome R. Bellegarda

Plausible-Parrots @ MSP2023: Enhancing Semantic Plausibility Modeling using Entity and Event Knowledge

In this work, we investigate the effectiveness of injecting external knowledge to a large language model (LLM) to identify semantic plausibility of simple events. Specifically, we enhance the LLM with fine-grained entity types, event types…

Computation and Language · Computer Science 2024-09-02 Chong Shen, Chenyue Zhou

Plausibility Vaccine: Injecting LLM Knowledge for Event Plausibility

Despite advances in language modelling, distributional methods that build semantic representations from co-occurrences fail to discriminate between plausible and implausible events. In this work, we investigate how plausibility prediction…

Computation and Language · Computer Science 2025-03-18 Jacob Chmura, Jonah Dauvet, Sebastian Sabry

Using Prior Knowledge to Guide BERT's Attention in Semantic Textual Matching Tasks

We study the problem of incorporating prior knowledge into a deep Transformer-based model, i.e., Bidirectional Encoder Representations from Transformers (BERT), to enhance its performance on semantic textual matching tasks. By probing and…

Computation and Language · Computer Science 2021-02-23 Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang

Leveraging Grammar Induction for Language Understanding and Generation

Grammar induction has made significant progress in recent years. However, it is not clear how the application of induced grammar could enhance practical performance in downstream tasks. In this work, we introduce an unsupervised grammar…

Computation and Language · Computer Science 2024-10-08 Jushi Kai, Shengyuan Hou, Yusheng Huang, Zhouhan Lin

Knowledge-Aware Language Model Pretraining

How much knowledge do pretrained language models hold? Recent research observed that pretrained transformers are adept at modeling semantics but it is unclear to what degree they grasp human knowledge, or how to ensure they do so. In this…

Computation and Language · Computer Science 2021-02-05 Corby Rosset, Chenyan Xiong, Minh Phan, Xia Song, Paul Bennett, Saurabh Tiwary

Commonsense Knowledge Transfer for Pre-trained Language Models

Despite serving as the foundation models for a wide range of NLP benchmarks, pre-trained language models have shown limited capabilities of acquiring implicit commonsense knowledge from self-supervision alone, compared to learning…

Computation and Language · Computer Science 2023-06-06 Wangchunshu Zhou, Ronan Le Bras, Yejin Choi

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model

Recent breakthroughs of pretrained language models have shown the effectiveness of self-supervised learning for a wide range of natural language processing (NLP) tasks. In addition to standard syntactic and semantic NLP tasks, pretrained…

Computation and Language · Computer Science 2019-12-23 Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov

GiBERT: Introducing Linguistic Knowledge into BERT through a Lightweight Gated Injection Method

Large pre-trained language models such as BERT have been the driving force behind recent improvements across many NLP tasks. However, BERT is only trained to predict missing words - either behind masks or in the next sentence - and has no…

Computation and Language · Computer Science 2020-10-26 Nicole Peinelt, Marek Rei, Maria Liakata

Better Neural Machine Translation by Extracting Linguistic Information from BERT

Adding linguistic information (syntax or semantics) to neural machine translation (NMT) has mostly focused on using point estimates from pre-trained models. Directly using the capacity of massive pre-trained contextual word embedding models…

Computation and Language · Computer Science 2021-04-08 Hassan S. Shavarani, Anoop Sarkar

Multi-Stage Pre-training for Low-Resource Domain Adaptation

Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before…

Computation and Language · Computer Science 2020-10-13 Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward

Revisiting Language Encoding in Learning Multilingual Representations

Transformer has demonstrated its great power to learn contextual word representations for multiple languages in a single model. To process multilingual sentences in the model, a learnable vector is usually assigned to each language, which…

Computation and Language · Computer Science 2021-02-17 Shengjie Luo, Kaiyuan Gao, Shuxin Zheng, Guolin Ke, Di He, Liwei Wang, Tie-Yan Liu

ConcEPT: Concept-Enhanced Pre-Training for Language Models

Pre-trained language models (PLMs) have been prevailing in state-of-the-art methods for natural language processing, and knowledge-enhanced PLMs are further proposed to promote model performance in knowledge-intensive tasks. However,…

Computation and Language · Computer Science 2024-01-12 Xintao Wang, Zhouhong Gu, Jiaqing Liang, Dakuan Lu, Yanghua Xiao, Wei Wang

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community. Recent interpretability methods project weights and hidden states obtained from the forward pass to the…

Computation and Language · Computer Science 2024-02-21 Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf

Knowledgeable Prompt-tuning: Incorporating Knowledge into Prompt Verbalizer for Text Classification

Tuning pre-trained language models (PLMs) with task-specific prompts has been a promising approach for text classification. Particularly, previous studies suggest that prompt-tuning has remarkable superiority in the low-data scenario over…

Computation and Language · Computer Science 2022-03-21 Shengding Hu, Ning Ding, Huadong Wang, Zhiyuan Liu, Jingang Wang, Juanzi Li, Wei Wu, Maosong Sun

Word Sense Induction with Knowledge Distillation from BERT

Pre-trained contextual language models are ubiquitously employed for language understanding tasks, but are unsuitable for resource-constrained systems. Noncontextual word embeddings are an efficient alternative in these settings. Such…

Computation and Language · Computer Science 2023-04-24 Anik Saha, Alex Gittens, Bulent Yener

Probing Pretrained Language Models for Lexical Semantics

The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture. While prior research focused on…

Computation and Language · Computer Science 2020-10-13 Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen

Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation

Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines. In this work, we propose a novel cross-lingual transfer method for…

Computation and Language · Computer Science 2017-07-24 Ivan Vulić, Nikola Mrkšić, Anna Korhonen

Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs

As the knowledge of large language models (LLMs) becomes outdated over time, there is a growing need for efficient methods to update them, especially when injecting proprietary information. Our study reveals that comprehension-intensive…

Computation and Language · Computer Science 2025-05-26 Essa Jan, Moiz Ali, Muhammad Saram Hassan, Fareed Zaffar, Yasir Zaki

Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers

Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work focused on injecting (structured) knowledge from external resources into these models. While on the…

Computation and Language · Computer Science 2020-10-13 Anne Lauscher, Olga Majewska, Leonardo F. R. Ribeiro, Iryna Gurevych, Nikolai Rozanov, Goran Glavaš