Related papers: Large Language Model Unlearning

A Closer Look at Machine Unlearning for Large Language Models

Large language models (LLMs) may memorize sensitive or copyrighted content, raising privacy and legal concerns. Due to the high cost of retraining from scratch, researchers attempt to employ machine unlearning to remove specific content…

Computation and Language · Computer Science 2025-08-12 Xiaojian Yuan, Tianyu Pang, Chao Du, Kejiang Chen, Weiming Zhang, Min Lin

Machine Unlearning in Large Language Models

Recently, large language models (LLMs) have emerged as a notable field, attracting significant attention for their ability to automatically generate intelligent content for various application domains. However, LLMs still suffer from…

Cryptography and Security · Computer Science 2024-04-29 Kongyang Chen, Zixin Wang, Bing Mi, Waixi Liu, Shaowei Wang, Xiaojun Ren, Jiaxing Shen

A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models

This study investigates machine unlearning techniques within the context of large language models (LLMs), referred to as LLM unlearning. LLM unlearning offers a principled approach to removing the influence of undesirable data…

Computation and Language · Computer Science 2025-06-03 Jiahui Geng, Qing Li, Herbert Woisetschlaeger, Zongxiong Chen, Fengyu Cai, Yuxia Wang, Preslav Nakov, Hans-Arno Jacobsen, Fakhri Karray

Bridging the Gap Between Preference Alignment and Machine Unlearning

Despite advances in Preference Alignment (PA) for Large Language Models (LLMs), mainstream methods like Reinforcement Learning with Human Feedback (RLHF) face notable challenges. These approaches require high-quality datasets of positive…

Machine Learning · Computer Science 2025-04-10 Xiaohua Feng, Yuyuan Li, Huwei Ji, Jiaming Zhang, Li Zhang, Tianyu Du, Chaochao Chen

Digital Forgetting in Large Language Models: A Survey of Unlearning Methods

The objective of digital forgetting is, given a model with undesirable knowledge or behavior, to obtain a new model where the detected issues are no longer present. The motivations for forgetting include privacy protection, copyright…

Cryptography and Security · Computer Science 2025-01-14 Alberto Blanco-Justicia, Najeeb Jebreel, Benet Manzanares, David Sánchez, Josep Domingo-Ferrer, Guillem Collell, Kuan Eeik Tan

UNLEARN: Efficient Removal of Knowledge in Large Language Models

Given the prevalence of large language models (LLMs) and the prohibitive cost of training these models from scratch, dynamically forgetting specific knowledge (e.g., private or proprietary) without retraining the model has become an…

Computation and Language · Computer Science 2024-08-09 Tyler Lizzo, Larry Heck

Unlearning in LLMs: Methods, Evaluation, and Open Challenges

Large language models (LLMs) have achieved remarkable success across natural language processing tasks, yet their widespread deployment raises pressing concerns around privacy, copyright, security, and bias. Machine unlearning has emerged…

Computation and Language · Computer Science 2026-01-21 Tyler Lizzo, Larry Heck

In-Context Unlearning: Language Models as Few Shot Unlearners

Machine unlearning, the study of efficiently removing the impact of specific training instances on a model, has garnered increased attention in recent years due to regulatory guidelines such as the Right to be Forgotten. Achieving…

Machine Learning · Computer Science 2024-06-07 Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju

A Survey on Unlearning in Large Language Models

Large Language Models (LLMs) demonstrate remarkable capabilities, but their training on massive corpora poses significant risks from memorized sensitive information. To mitigate these issues and align with legal standards, unlearning has…

Computation and Language · Computer Science 2025-11-18 Ruichen Qiu, Jiajun Tan, Jiayue Pu, Honglin Wang, Xiao-Shan Gao, Fei Sun

Rethinking Machine Unlearning for Large Language Models

We explore machine unlearning (MU) in the domain of large language models (LLMs), referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence (e.g., sensitive or illegal information) and the associated model…

Machine Learning · Computer Science 2024-12-10 Sijia Liu, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, Peter Hase, Yuguang Yao, Chris Yuhao Liu, Xiaojun Xu, Hang Li, Kush R. Varshney, Mohit Bansal, Sanmi Koyejo, Yang Liu

Align-then-Unlearn: Embedding Alignment for LLM Unlearning

As large language models (LLMs) are trained on massive datasets, they have raised significant privacy and ethical concerns due to their potential to inadvertently retain sensitive information. Unlearning seeks to selectively remove specific…

Computation and Language · Computer Science 2025-06-17 Philipp Spohn, Leander Girrbach, Jessica Bader, Zeynep Akata

Machine Unlearning of Pre-trained Large Language Models

This study investigates the concept of the 'right to be forgotten' within the context of large language models (LLMs). We explore machine unlearning as a pivotal solution, with a focus on pre-trained models, a notably under-researched area.…

Computation and Language · Computer Science 2024-05-31 Jin Yao, Eli Chien, Minxin Du, Xinyao Niu, Tianhao Wang, Zezhou Cheng, Xiang Yue

Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges

In recent years, large language models (LLMs) have spurred a new research paradigm in natural language processing. Despite their excellent capability in knowledge-based question answering and reasoning, their potential to retain faulty or…

Computation and Language · Computer Science 2023-12-11 Nianwen Si, Hao Zhang, Heyu Chang, Wenlin Zhang, Dan Qu, Weiqiang Zhang

The Frontier of Data Erasure: Machine Unlearning for Large Language Models

Large Language Models (LLMs) are foundational to AI advancements, facilitating applications like predictive text generation. Nonetheless, they pose risks by potentially memorizing and disseminating sensitive, biased, or copyrighted…

Artificial Intelligence · Computer Science 2024-03-26 Youyang Qu, Ming Ding, Nan Sun, Kanchana Thilakarathna, Tianqing Zhu, Dusit Niyato

Does Unlearning Truly Unlearn? A Black Box Evaluation of LLM Unlearning Methods

Large language model unlearning aims to remove harmful information that LLMs have learnt to prevent their use for malicious purposes. LLMU and RMU have been proposed as two methods for LLM unlearning, achieving impressive results on…

Computation and Language · Computer Science 2025-02-25 Jai Doshi, Asa Cooper Stickland

A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty

Driven by privacy protection laws and regulations, unlearning in Large Language Models (LLMs) is gaining increasing attention. However, current research often neglects the interpretability of the unlearning process, particularly concerning…

Machine Learning · Computer Science 2025-04-10 Xiaohua Feng, Yuyuan Li, Chengye Wang, Junlin Liu, Li Zhang, Chaochao Chen

UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI

Exact unlearning was first introduced as a privacy mechanism that allowed a user to retract their data from machine learning models on request. Shortly after, inexact schemes were proposed to mitigate the impractical costs associated with…

Machine Learning · Computer Science 2024-07-02 Ilia Shumailov, Jamie Hayes, Eleni Triantafillou, Guillermo Ortiz-Jimenez, Nicolas Papernot, Matthew Jagielski, Itay Yona, Heidi Howard, Eugene Bagdasaryan

Unlearn What You Want to Forget: Efficient Unlearning for LLMs

Large language models (LLMs) have achieved significant progress from pre-training on and memorizing a wide range of textual data; however, this process might suffer from privacy issues and violations of data protection regulations. As a…

Computation and Language · Computer Science 2023-11-01 Jiaao Chen, Diyi Yang

Towards Unveiling Vulnerabilities of Large Reasoning Models in Machine Unlearning

Large language models (LLMs) possess strong semantic understanding, driving significant progress in data mining applications. This is further enhanced by large reasoning models (LRMs), which provide explicit multi-step reasoning traces. On…

Machine Learning · Computer Science 2026-04-07 Aobo Chen, Chenxu Zhao, Chenglin Miao, Mengdi Huai

Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference

As Large Language Models (LLMs) demonstrate extensive capability in learning from documents, LLM unlearning becomes an increasingly important research area to address concerns of LLMs in terms of privacy, copyright, etc. A conventional LLM…

Computation and Language · Computer Science 2024-06-14 Jiabao Ji, Yujian Liu, Yang Zhang, Gaowen Liu, Ramana Rao Kompella, Sijia Liu, Shiyu Chang