Related papers: Knowledge Infused Decoding

LFD: Layer Fused Decoding to Exploit External Knowledge in Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) incorporates external knowledge into large language models (LLMs), improving their adaptability to downstream tasks and enabling information updates. Surprisingly, recent empirical evidence demonstrates…

Computation and Language · Computer Science 2026-01-08 Yang Sun, Zhiyong Xie, Lixin Zou, Dan Luo, Min Tang, Xiangyu Zhao, Yunwei Zhao, Xixun Lin, Yanxiong Lu, Chenliang Li

KID: Knowledge-Injected Dual-Head Learning for Knowledge-Grounded Harmful Meme Detection

Internet memes have become pervasive carriers of digital culture on social platforms. However, their heavy reliance on metaphors and sociocultural context also makes them subtle vehicles for harmful content, posing significant challenges…

Computation and Language · Computer Science 2026-01-30 Yaocong Li, Leihan Zhang, Le Zhang, Qiang Yan

KILM: Knowledge Injection into Encoder-Decoder Language Models

Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters. To enhance this implicit knowledge, we propose Knowledge Injection into Language Models (KILM), a novel approach that injects…

Computation and Language · Computer Science 2023-02-21 Yan Xu, Mahdi Namazifar, Devamanyu Hazarika, Aishwarya Padmakumar, Yang Liu, Dilek Hakkani-Tür

A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training

Modern Natural Language Generation (NLG) models come with massive computational and storage requirements. In this work, we study the potential of compressing them, which is crucial for real-world applications serving millions of users. We…

Computation and Language · Computer Science 2023-05-29 Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor

Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks

Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks that require a compound understanding of knowledge. However, deployment of the LLMs in real-world applications can be challenging due to…

Computation and Language · Computer Science 2023-10-31 Minki Kang, Seanie Lee, Jinheon Baek, Kenji Kawaguchi, Sung Ju Hwang

KUDA: Knowledge Unlearning by Deviating Representation for Large Language Models

Large language models (LLMs) acquire a large amount of knowledge through pre-training on vast and diverse corpora. While this endows LLMs with strong capabilities in generation and reasoning, it amplifies risks associated with sensitive,…

Cryptography and Security · Computer Science 2026-02-25 Ce Fang, Zhikun Zhang, Min Chen, Qing Liu, Lu Zhou, Zhe Liu, Yunjun Gao

Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL

Large Language Models (LLMs) have shown promising performance in text-to-SQL, which involves translating natural language questions into SQL queries. However, current text-to-SQL LLMs are computationally expensive and challenging to deploy…

Computation and Language · Computer Science 2024-10-16 Qihuang Zhong, Kunfeng Chen, Liang Ding, Juhua Liu, Bo Du, Dacheng Tao

Knowledge Integration Decay in Search-Augmented Reasoning of Large Language Models

Modern Large Language Models (LLMs) have demonstrated remarkable capabilities in complex tasks by employing search-augmented reasoning to incorporate external knowledge into long chains of thought. However, we identify a critical yet…

Computation and Language · Computer Science 2026-02-11 Sangwon Yu, Ik-hwan Kim, Donghun Kang, Bongkyu Hwang, Junhwa Choi, Suk-hoon Jung, Seungki Hong, Taehee Lee, Sungroh Yoon

Lexical Knowledge Internalization for Neural Dialog Generation

We propose knowledge internalization (KI), which aims to complement the lexical knowledge into neural dialog models. Instead of further conditioning the knowledge-grounded dialog (KGD) models on externally retrieved knowledge, we seek to…

Computation and Language · Computer Science 2022-05-05 Zhiyong Wu, Wei Bi, Xiang Li, Lingpeng Kong, Ben Kao

A Cohesive Distillation Architecture for Neural Language Models

A recent trend in Natural Language Processing is the exponential growth in Language Model (LM) size, which prevents research groups without a necessary hardware infrastructure from participating in the development process. This study…

Computation and Language · Computer Science 2023-01-31 Jan Philip Wahle

What Has Been Enhanced in my Knowledge-Enhanced Language Model?

Pretrained language models (LMs) do not capture factual knowledge very well. This has led to the development of a number of knowledge integration (KI) methods which aim to incorporate external knowledge into pretrained LMs. Even though KI…

Computation and Language · Computer Science 2022-11-17 Yifan Hou, Guoji Fu, Mrinmaya Sachan

Enhancing Multilingual Language Model with Massive Multilingual Knowledge Triples

Knowledge-enhanced language representation learning has shown promising results across various knowledge-intensive NLP tasks. However, prior methods are limited in efficient utilization of multilingual knowledge graph (KG) data for language…

Computation and Language · Computer Science 2022-10-20 Linlin Liu, Xin Li, Ruidan He, Lidong Bing, Shafiq Joty, Luo Si

An Empirical Study of Knowledge Distillation for Code Understanding Tasks

Pre-trained language models (PLMs) have emerged as powerful tools for code understanding. However, deploying these PLMs in large-scale applications faces practical challenges due to their computational intensity and inference latency.…

Software Engineering · Computer Science 2025-08-22 Ruiqi Wang, Zezhou Yang, Cuiyun Gao, Xin Xia, Qing Liao

Deepfake Detection via Knowledge Injection

Deepfake detection technologies become vital because current generative AI models can generate realistic deepfakes, which may be utilized in malicious purposes. Existing deepfake detection methods either rely on developing classification…

Computer Vision and Pattern Recognition · Computer Science 2025-03-05 Tonghui Li, Yuanfang Guo, Zeming Liu, Heqi Peng, Yunhong Wang

KIND: Knowledge Integration and Diversion for Training Decomposable Models

Pre-trained models have become the preferred backbone due to the increasing complexity of model parameters. However, traditional pre-trained models often face deployment challenges due to their fixed sizes, and are prone to negative…

Computer Vision and Pattern Recognition · Computer Science 2025-05-21 Yucheng Xie, Fu Feng, Ruixiao Shi, Jing Wang, Yong Rui, Xin Geng

Where Knowledge Collides: A Mechanistic Study of Intra-Memory Knowledge Conflict in Language Models

In language models (LMs), intra-memory knowledge conflict largely arises when inconsistent information about the same event is encoded within the model's parametric knowledge. While prior work has primarily focused on resolving conflicts…

Computation and Language · Computer Science 2026-01-15 Minh Vu Pham, Hsuvas Borkakoty, Yufang Hou

Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction

Large Language Models (LLMs) often struggle with producing factually consistent answers due to limitations in their parametric memory. Retrieval-Augmented Generation (RAG) paradigms mitigate this issue by incorporating external knowledge at…

Computation and Language · Computer Science 2026-05-05 Shanglin Wu, Lihui Liu, Jinho D. Choi, Kai Shu

Comparing Knowledge Injection Methods for LLMs in a Low-Resource Regime

Large language models (LLMs) often require vast amounts of text to effectively acquire new knowledge. While continuing pre-training on large corpora or employing retrieval-augmented generation (RAG) has proven successful, updating an LLM…

Computation and Language · Computer Science 2025-08-11 Hugo Abonizio, Thales Almeida, Roberto Lotufo, Rodrigo Nogueira

SKILL: Structured Knowledge Infusion for Large Language Models

Large language models (LLMs) have demonstrated human-level performance on a vast spectrum of natural language tasks. However, it is largely unexplored whether they can better internalize knowledge from a structured data, such as a knowledge…

Computation and Language · Computer Science 2022-05-18 Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi

Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models

While Large Vision-Language Models (LVLMs) have rapidly advanced in recent years, the prevalent issue known as the 'hallucination' problem has emerged as a significant bottleneck, hindering their real-world deployments. Existing methods…

Computer Vision and Pattern Recognition · Computer Science 2025-03-18 Fushuo Huo, Wenchao Xu, Zhong Zhang, Haozhao Wang, Zhicheng Chen, Peilin Zhao