Related papers: Erasing Conceptual Knowledge from Language Models

UNLEARN Efficient Removal of Knowledge in Large Language Models

Given the prevalence of large language models (LLMs) and the prohibitive cost of training these models from scratch, dynamically forgetting specific knowledge e.g., private or proprietary, without retraining the model has become an…

Computation and Language · Computer Science 2024-08-09 Tyler Lizzo, Larry Heck

ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition

Machine unlearning in Vision-Language Models (VLMs) is typically performed at the image or instance level, making it difficult to precisely remove target knowledge without affecting unrelated semantics. This issue is especially pronounced…

Computer Vision and Pattern Recognition · Computer Science 2026-05-18 Shen Lin, Jing Lin, Junhao Dong, Piotr Koniusz, Li Xu

Unveiling Entity-Level Unlearning for Large Language Models: A Comprehensive Analysis

Large language model unlearning has garnered increasing attention due to its potential to address security and privacy concerns, leading to extensive research in the field. However, much of this research has concentrated on instance-level…

Computation and Language · Computer Science 2025-05-20 Weitao Ma, Xiaocheng Feng, Weihong Zhong, Lei Huang, Yangfan Ye, Xiachong Feng, Bing Qin

Eraser: Jailbreaking Defense in Large Language Models via Unlearning Harmful Knowledge

Jailbreaking attacks can enable Large Language Models (LLMs) to bypass the safeguard and generate harmful content. Existing jailbreaking defense methods have failed to address the fundamental issue that harmful knowledge resides within the…

Computation and Language · Computer Science 2024-07-04 Weikai Lu, Ziqian Zeng, Jianwei Wang, Zhengdong Lu, Zelin Chen, Huiping Zhuang, Cen Chen

Exclusive Unlearning

When introducing Large Language Models (LLMs) into industrial applications, such as healthcare and education, the risk of generating harmful content becomes a significant challenge. While existing machine unlearning methods can erase…

Computation and Language · Computer Science 2026-04-08 Mutsumi Sasaki, Kouta Nakayama, Yusuke Miyao, Yohei Oseki, Masaru Isonuma

The Frontier of Data Erasure: Machine Unlearning for Large Language Models

Large Language Models (LLMs) are foundational to AI advancements, facilitating applications like predictive text generation. Nonetheless, they pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted…

Artificial Intelligence · Computer Science 2024-03-26 Youyang Qu, Ming Ding, Nan Sun, Kanchana Thilakarathna, Tianqing Zhu, Dusit Niyato

Intrinsic Test of Unlearning Using Parametric Knowledge Traces

The task of "unlearning" certain concepts in large language models (LLMs) has attracted immense attention recently, due to its importance in mitigating undesirable model behaviours, such as the generation of harmful, private, or incorrect…

Computation and Language · Computer Science 2025-09-03 Yihuai Hong, Lei Yu, Haiqin Yang, Shauli Ravfogel, Mor Geva

Compensation-free Machine Unlearning in Text-to-Image Diffusion Models by Eliminating the Mutual Information

The powerful generative capabilities of diffusion models have raised growing privacy and safety concerns regarding generating sensitive or undesired content. In response, machine unlearning (MU) -- commonly referred to as concept erasure…

Machine Learning · Computer Science 2026-03-03 Xinwen Cheng, Jingyuan Zhang, Zhehao Huang, Yingwen Wu, Xiaolin Huang

Unlearning in LLMs: Methods, Evaluation, and Open Challenges

Large language models (LLMs) have achieved remarkable success across natural language processing tasks, yet their widespread deployment raises pressing concerns around privacy, copyright, security, and bias. Machine unlearning has emerged…

Computation and Language · Computer Science 2026-01-21 Tyler Lizzo, Larry Heck

ECLM: Entity Level Language Model for Spoken Language Understanding with Chain of Intent

Large Language Models (LLMs) have demonstrated impressive capabilities in language generation and general task performance. However, their application to spoken language understanding (SLU) remains challenging, particularly for token-level…

Computation and Language · Computer Science 2025-10-09 Shangjian Yin, Peijie Huang, Jiatian Chen, Haojing Huang, Yuhong Xu

Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning

As large language models (LLMs) are applied across diverse domains, the ability to selectively unlearn specific information is becoming increasingly essential. For instance, LLMs are expected to selectively provide confidential information…

Computation and Language · Computer Science 2025-06-04 Shota Takashiro, Takeshi Kojima, Andrew Gambardella, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo

CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept

Large Language Models (LLMs) offer extensive knowledge across various domains, but they may inadvertently memorize sensitive, unauthorized, or malicious data, such as personal information in the medical and financial sectors. Machine…

Computation and Language · Computer Science 2024-10-16 YuXuan Wu, Bonaventure F. P. Dossou, Dianbo Liu

In-Context Unlearning: Language Models as Few Shot Unlearners

Machine unlearning, the study of efficiently removing the impact of specific training instances on a model, has garnered increased attention in recent years due to regulatory guidelines such as the Right to be Forgotten. Achieving…

Machine Learning · Computer Science 2024-06-07 Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models

Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for…

Computation and Language · Computer Science 2024-06-18 Zhuoran Jin, Pengfei Cao, Chenhao Wang, Zhitao He, Hongbang Yuan, Jiachun Li, Yubo Chen, Kang Liu, Jun Zhao

Unlearn What You Want to Forget: Efficient Unlearning for LLMs

Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data, however, this process might suffer from privacy issues and violations of data protection regulations. As a…

Computation and Language · Computer Science 2023-11-01 Jiaao Chen, Diyi Yang

RESTOR: Knowledge Recovery in Machine Unlearning

Large language models trained on web-scale corpora can memorize undesirable data containing misinformation, copyrighted material, or private or sensitive information. Recently, several machine unlearning algorithms have been proposed to…

Computation and Language · Computer Science 2025-05-27 Keivan Rezaei, Khyathi Chandu, Soheil Feizi, Yejin Choi, Faeze Brahman, Abhilasha Ravichander

Scrub It Out! Erasing Sensitive Memorization in Code Language Models via Machine Unlearning

While Code Language Models (CLMs) have demonstrated superior performance in software engineering tasks such as code generation and summarization, recent empirical studies reveal a critical privacy vulnerability: these models exhibit…

Software Engineering · Computer Science 2025-09-18 Zhaoyang Chu, Yao Wan, Zhikun Zhang, Di Wang, Zhou Yang, Hongyu Zhang, Pan Zhou, Xuanhua Shi, Hai Jin, David Lo

Align-then-Unlearn: Embedding Alignment for LLM Unlearning

As large language models (LLMs) are trained on massive datasets, they have raised significant privacy and ethical concerns due to their potential to inadvertently retain sensitive information. Unlearning seeks to selectively remove specific…

Computation and Language · Computer Science 2025-06-17 Philipp Spohn, Leander Girrbach, Jessica Bader, Zeynep Akata

Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness

Machine unlearning techniques aim to mitigate unintended memorization in large language models (LLMs). However, existing approaches predominantly focus on the explicit removal of isolated facts, often overlooking latent inferential…

Computation and Language · Computer Science 2025-10-23 Rongzhe Wei, Peizhi Niu, Hans Hao-Hsun Hsu, Ruihan Wu, Haoteng Yin, Mohsen Ghassemi, Yifan Li, Vamsi K. Potluru, Eli Chien, Kamalika Chaudhuri, Olgica Milenkovic, Pan Li

REMIND: Input Loss Landscapes Reveal Residual Memorization in Post-Unlearning LLMs

Machine unlearning aims to remove the influence of specific training data from a model without requiring full retraining. This capability is crucial for ensuring privacy, safety, and regulatory compliance. Therefore, verifying whether a…

Computation and Language · Computer Science 2025-11-07 Liran Cohen, Yaniv Nemcovesky, Avi Mendelson