Related papers: Erasing Without Remembering: Implicit Knowledge Fo…

Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning

As large language models (LLMs) are applied across diverse domains, the ability to selectively unlearn specific information is becoming increasingly essential. For instance, LLMs are expected to selectively provide confidential information…

Computation and Language · Computer Science 2025-06-04 Shota Takashiro, Takeshi Kojima, Andrew Gambardella, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo

Understanding the Dilemma of Unlearning for Large Language Models

Unlearning seeks to remove specific knowledge from large language models (LLMs), but its effectiveness remains contested. On one side, "forgotten" knowledge can often be recovered through interventions such as light fine-tuning; on the…

Computation and Language · Computer Science 2025-09-30 Qingjie Zhang, Haoting Qian, Zhicong Huang, Cheng Hong, Minlie Huang, Ke Xu, Chao Zhang, Han Qiu

Forget to Know, Remember to Use: Context-Aware Unlearning for Large Language Models

Large language models may encode sensitive information or outdated knowledge that needs to be removed, to ensure responsible and compliant model responses. Unlearning has emerged as an efficient alternative to full retraining, aiming to…

Computation and Language · Computer Science 2026-05-28 Yuefeng Peng, Parnian Afshar, Megan Ganji, Thomas Butler, Amir Houmansadr, Mingxian Wang, Dezhi Hong

UNLEARN Efficient Removal of Knowledge in Large Language Models

Given the prevalence of large language models (LLMs) and the prohibitive cost of training these models from scratch, dynamically forgetting specific knowledge e.g., private or proprietary, without retraining the model has become an…

Computation and Language · Computer Science 2024-08-09 Tyler Lizzo, Larry Heck

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models

Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for…

Computation and Language · Computer Science 2024-06-18 Zhuoran Jin, Pengfei Cao, Chenhao Wang, Zhitao He, Hongbang Yuan, Jiachun Li, Yubo Chen, Kang Liu, Jun Zhao

Towards Robust Evaluation of Unlearning in LLMs via Data Transformations

Large Language Models (LLMs) have shown to be a great success in a wide range of applications ranging from regular NLP-based use cases to AI agents. LLMs have been trained on a vast corpus of texts from various sources; despite the best…

Computation and Language · Computer Science 2024-11-26 Abhinav Joshi, Shaswati Saha, Divyaksh Shukla, Sriram Vema, Harsh Jhamtani, Manas Gaur, Ashutosh Modi

ReLearn: Unlearning via Learning for Large Language Models

Current unlearning methods for large language models usually rely on reverse optimization to reduce target token probabilities. However, this paradigm disrupts the subsequent tokens prediction, degrading model performance and linguistic…

Computation and Language · Computer Science 2025-05-29 Haoming Xu, Ningyuan Zhao, Liming Yang, Sendong Zhao, Shumin Deng, Mengru Wang, Bryan Hooi, Nay Oo, Huajun Chen, Ningyu Zhang

UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets

Large Language Models (LLMs) inevitably acquire harmful information during training on massive datasets. LLM unlearning aims to eliminate the influence of such harmful information while maintaining the model's overall performance. Existing…

Computation and Language · Computer Science 2025-03-07 Wenyu Wang, Mengqi Zhang, Xiaotian Ye, Zhaochun Ren, Zhumin Chen, Pengjie Ren

Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness

Machine unlearning techniques aim to mitigate unintended memorization in large language models (LLMs). However, existing approaches predominantly focus on the explicit removal of isolated facts, often overlooking latent inferential…

Computation and Language · Computer Science 2025-10-23 Rongzhe Wei, Peizhi Niu, Hans Hao-Hsun Hsu, Ruihan Wu, Haoteng Yin, Mohsen Ghassemi, Yifan Li, Vamsi K. Potluru, Eli Chien, Kamalika Chaudhuri, Olgica Milenkovic, Pan Li

Leak@$k$: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding

Unlearning in large language models (LLMs) is critical for regulatory compliance and for building ethical generative AI systems that avoid producing private, toxic, illegal, or copyrighted content. Despite rapid progress, in this work, we…

Machine Learning · Computer Science 2026-05-29 Hadi Reisizadeh, Jiajun Ruan, Yiwei Chen, Soumyadeep Pal, Sijia Liu, Mingyi Hong

Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs

Unlearning in large language models (LLMs) aims to remove specified data, but its efficacy is typically assessed with task-level metrics like accuracy and perplexity. We show that these metrics can be misleading, as models can appear to…

Computation and Language · Computer Science 2026-05-19 Xiaoyu Xu, Xiang Yue, Yang Liu, Qingqing Ye, Huadi Zheng, Peizhao Hu, Minxin Du, Haibo Hu

Digital Forgetting in Large Language Models: A Survey of Unlearning Methods

The objective of digital forgetting is, given a model with undesirable knowledge or behavior, obtain a new model where the detected issues are no longer present. The motivations for forgetting include privacy protection, copyright…

Cryptography and Security · Computer Science 2025-01-14 Alberto Blanco-Justicia, Najeeb Jebreel, Benet Manzanares, David Sánchez, Josep Domingo-Ferrer, Guillem Collell, Kuan Eeik Tan

To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models

Large Language Models (LLMs) trained on extensive corpora inevitably retain sensitive data, such as personal privacy information and copyrighted material. Recent advancements in knowledge unlearning involve updating LLM parameters to erase…

Computation and Language · Computer Science 2024-10-08 Bozhong Tian, Xiaozhuan Liang, Siyuan Cheng, Qingbin Liu, Mengru Wang, Dianbo Sui, Xi Chen, Huajun Chen, Ningyu Zhang

Exploring Forgetting in Large Language Model Pre-Training

Catastrophic forgetting remains a formidable obstacle to building an omniscient model in large language models (LLMs). Despite the pioneering research on task-level forgetting in LLM fine-tuning, there is scant focus on forgetting during…

Computation and Language · Computer Science 2024-10-23 Chonghua Liao, Ruobing Xie, Xingwu Sun, Haowen Sun, Zhanhui Kang

TOFU: A Task of Fictitious Unlearning for LLMs

Large language models trained on massive corpora of data from the web can memorize and reproduce sensitive or private data raising both legal and ethical concerns. Unlearning, or tuning models to forget information present in their training…

Machine Learning · Computer Science 2024-01-12 Pratyush Maini, Zhili Feng, Avi Schwarzschild, Zachary C. Lipton, J. Zico Kolter

Secure Forgetting: A Framework for Privacy-Driven Unlearning in Large Language Model (LLM)-Based Agents

Large language model (LLM)-based agents have recently gained considerable attention due to the powerful reasoning capabilities of LLMs. Existing research predominantly focuses on enhancing the task performance of these agents in diverse…

Multiagent Systems · Computer Science 2026-04-02 Dayong Ye, Tianqing Zhu, Congcong Zhu, Feng He, Qi He, Shang Wang, Bo Liu, Wanlei Zhou

LLM Unlearning on Noisy Forget Sets: A Study of Incomplete, Rewritten, and Watermarked Data

Large language models (LLMs) exhibit remarkable generative capabilities but raise ethical and security concerns by memorizing sensitive data, reinforcing biases, and producing harmful content. These risks have spurred interest in LLM…

Machine Learning · Computer Science 2025-10-13 Changsheng Wang, Yihua Zhang, Dennis Wei, Jinghan Jia, Pin-Yu Chen, Sijia Liu

Not All Tokens Are Meant to Be Forgotten

Large Language Models (LLMs), pre-trained on massive text corpora, exhibit remarkable human-level language understanding, reasoning, and decision-making abilities. However, they tend to memorize unwanted information, such as private or…

Machine Learning · Computer Science 2026-01-01 Xiangyu Zhou, Yao Qiang, Saleh Zare Zade, Douglas Zytko, Prashant Khanduri, Dongxiao Zhu

A Survey on Unlearning in Large Language Models

Large Language Models (LLMs) demonstrate remarkable capabilities, but their training on massive corpora poses significant risks from memorized sensitive information. To mitigate these issues and align with legal standards, unlearning has…

Computation and Language · Computer Science 2025-11-18 Ruichen Qiu, Jiajun Tan, Jiayue Pu, Honglin Wang, Xiao-Shan Gao, Fei Sun

Not Every Token Needs Forgetting: Selective Unlearning to Limit Change in Utility in Large Language Model Unlearning

Large Language Model (LLM) unlearning has recently gained significant attention, driven by the need to remove unwanted information, such as private, sensitive, or copyrighted content, from LLMs. However, conventional unlearning approaches…

Computation and Language · Computer Science 2025-06-03 Yixin Wan, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, Rahul Gupta