Related papers: Knowledge Unlearning for Mitigating Privacy Risks …

200 papers

LLMs have been found to memorize training textual sequences and regurgitate verbatim said sequences during text generation time. This fact is known to be the cause of privacy and related (e.g., copyright) problems. Unlearning in LLMs then…

Machine Learning · Computer Science 2024-05-07 George-Octavian Barbulescu, Peter Triantafillou

Although language models (LMs) demonstrate exceptional capabilities on various tasks, they are potentially vulnerable to extraction attacks, which represent a significant privacy risk. To mitigate the privacy concerns of LMs, machine…

Computation and Language · Computer Science 2024-06-21 Dohyun Lee, Daniel Rim, Minseok Choi, Jaegul Choo

Large Language Models are typically trained on datasets collected from the web, which may inadvertently contain harmful or sensitive personal information. To address growing privacy concerns, unlearning methods have been proposed to remove…

Machine Learning · Computer Science 2025-10-23 Xiaoyu Wu, Yifei Pang, Terrance Liu, Zhiwei Steven Wu

Large Language Models (LLMs) have shown greatly enhanced performance in recent years, attributed to increased size and extensive training data. This advancement has led to widespread interest and adoption across industries and the public.…

Computation and Language · Computer Science 2024-06-19 Victoria Smith, Ali Shahin Shamsabadi, Carolyn Ashurst, Adrian Weller

Large Language Models (LLMs) embed sensitive, human-generated data, prompting the need for unlearning methods. Although certified unlearning offers strong privacy guarantees, its restrictive assumptions make it unsuitable for LLMs, giving…

Machine Learning · Computer Science 2025-06-03 Rongzhe Wei, Mufei Li, Mohsen Ghassemi, Eleonora Kreačić, Yifan Li, Xiang Yue, Bo Li, Vamsi K. Potluru, Pan Li, Eli Chien

While Code Language Models (CLMs) have demonstrated superior performance in software engineering tasks such as code generation and summarization, recent empirical studies reveal a critical privacy vulnerability: these models exhibit…

Software Engineering · Computer Science 2025-09-18 Zhaoyang Chu, Yao Wan, Zhikun Zhang, Di Wang, Zhou Yang, Hongyu Zhang, Pan Zhou, Xuanhua Shi, Hai Jin, David Lo

As large language models (LLMs) are trained on massive datasets, they have raised significant privacy and ethical concerns due to their potential to inadvertently retain sensitive information. Unlearning seeks to selectively remove specific…

Computation and Language · Computer Science 2025-06-17 Philipp Spohn, Leander Girrbach, Jessica Bader, Zeynep Akata

Large language models (LLMs) have recently revolutionized language processing tasks but have also brought ethical and legal issues. LLMs have a tendency to memorize potentially private or copyrighted information present in the training…

Machine Learning · Computer Science 2025-07-21 Tamim Al Mahmud, Najeeb Jebreel, Josep Domingo-Ferrer, David Sanchez

Unlearning algorithms aim to remove deleted data's influence from trained models at a cost lower than full retraining. However, prior guarantees of unlearning in literature are flawed and don't protect the privacy of deleted records. We…

Machine Learning · Statistics 2023-02-15 Rishav Chourasia, Neil Shah

Large Language Models (LLMs) have a privacy concern because they memorize training data (including personally identifiable information (PII) like emails and phone numbers) and leak it during inference. A company can train an LLM on its…

Cryptography and Security · Computer Science 2023-07-21 Jaydeep Borkar

Large Language Models for Code (LLMs4Code) have achieved strong performance in code generation, but recent studies reveal that they may memorize and leak sensitive information contained in training data, posing serious privacy risks. To…

Cryptography and Security · Computer Science 2026-01-29 Shanzhi Gu, Zhaoyang Qu, Ruotong Geng, Mingyang Geng, Shangwen Wang, Chuanfu Xu, Haotian Wang, Zhipeng Lin, Dezun Dong

Large Language Models (LLMs) trained on extensive corpora inevitably retain sensitive data, such as personal privacy information and copyrighted material. Recent advancements in knowledge unlearning involve updating LLM parameters to erase…

Computation and Language · Computer Science 2024-10-08 Bozhong Tian, Xiaozhuan Liang, Siyuan Cheng, Qingbin Liu, Mengru Wang, Dianbo Sui, Xi Chen, Huajun Chen, Ningyu Zhang

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse natural language processing tasks, but their tendency to memorize training data poses significant privacy risks, particularly during fine-tuning…

Computation and Language · Computer Science 2025-08-21 Badrinath Ramakrishnan, Akshaya Balaji

The growing use of large language models in sensitive domains has exposed a critical weakness: the inability to ensure that private information can be permanently forgotten. Yet these systems still lack reliable mechanisms to guarantee that…

Machine Learning · Computer Science 2025-11-14 James Jin Kang, Dang Bui, Thanh Pham, Huo-Chong Ling

Fine-tuning large language models on private data for downstream applications poses significant privacy risks in potentially exposing sensitive information. Several popular community platforms now offer convenient distribution of a large…

Machine Learning · Computer Science 2024-09-02 Md Rafi Ur Rashid, Jing Liu, Toshiaki Koike-Akino, Shagufta Mehnaz, Ye Wang

As large language models (LLMs) are applied across diverse domains, the ability to selectively unlearn specific information is becoming increasingly essential. For instance, LLMs are expected to selectively provide confidential information…

Computation and Language · Computer Science 2025-06-04 Shota Takashiro, Takeshi Kojima, Andrew Gambardella, Qi Cao, Yusuke Iwasawa, Yutaka Matsuo

Machine unlearning has emerged as a critical capability for addressing privacy, safety, and regulatory concerns in large language models (LLMs). Existing methods operate at the sequence level, applying uniform updates across all tokens…

Computation and Language · Computer Science 2026-05-07 Jiawei Wu, Doudou Zhou

Given the prevalence of large language models (LLMs) and the prohibitive cost of training these models from scratch, dynamically forgetting specific knowledge e.g., private or proprietary, without retraining the model has become an…

Computation and Language · Computer Science 2024-08-09 Tyler Lizzo, Larry Heck

Large language models (LLMs) exhibit remarkable capabilities in understanding and generating natural language. However, these models can inadvertently memorize private information, posing significant privacy risks. This study addresses the…

Computation and Language · Computer Science 2024-09-17 Zhenhua Liu, Tong Zhu, Chuanyuan Tan, Wenliang Chen

Large language models (LLMs) have become the backbone of modern natural language processing but pose privacy concerns about leaking sensitive training data. Membership inference attacks (MIAs), which aim to infer whether a sample is…

Machine Learning · Computer Science 2025-06-03 Toan Tran, Ruixuan Liu, Li Xiong