Related papers: Mitigating Memorization In Language Models

The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation

Large Language Models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks, yet they also exhibit memorization of their training data. This phenomenon raises critical questions about model behavior, privacy risks,…

Machine Learning · Computer Science 2025-12-15 Alexander Xiong, Xuandong Zhao, Aneesh Pappu, Dawn Song

Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning

Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. This issue arises due to the model's tendency to overwrite previously acquired knowledge with new…

Machine Learning · Computer Science 2025-12-02 Lama Alssum, Hasan Abed Al Kader Hammoud, Motasem Alfarra, Juan C Leon Alcazar, Bernard Ghanem

Predicting memorization within Large Language Models fine-tuned for classification

Large Language Models have received significant attention due to their abilities to solve a wide range of complex tasks. However, these models memorize a significant proportion of their training data, posing a serious threat when disclosed…

Cryptography and Security · Computer Science 2025-07-16 Jérémie Dentan, Davide Buscaldi, Aymen Shabou, Sonia Vanier

Detecting Memorization in Large Language Models

Large language models (LLMs) have achieved impressive results in natural language processing but are prone to memorizing portions of their training data, which can compromise evaluation metrics, raise privacy concerns, and limit…

Machine Learning · Computer Science 2024-12-03 Eduardo Slonski

Early Detection and Reduction of Memorisation for Domain Adaptation and Instruction Tuning

Although large language models excel across many tasks, they can memorise training data and thereby expose private or copyrighted text. Most defences target the pre-training stage, leaving memorisation during fine-tuning, especially for…

Computation and Language · Computer Science 2025-10-14 Dean L. Slack, Noura Al Moubayed

UNLEARN Efficient Removal of Knowledge in Large Language Models

Given the prevalence of large language models (LLMs) and the prohibitive cost of training these models from scratch, dynamically forgetting specific knowledge (e.g., private or proprietary) without retraining the model has become an…

Computation and Language · Computer Science 2024-08-09 Tyler Lizzo, Larry Heck

Leaner Training, Lower Leakage: Revisiting Memorization in LLM Fine-Tuning with LoRA

Memorization in large language models (LLMs) makes them vulnerable to data extraction attacks. While pre-training memorization has been extensively studied, fewer works have explored its impact in fine-tuning, particularly for LoRA…

Machine Learning · Computer Science 2025-06-27 Fei Wang, Baochun Li

Scrub It Out! Erasing Sensitive Memorization in Code Language Models via Machine Unlearning

While Code Language Models (CLMs) have demonstrated superior performance in software engineering tasks such as code generation and summarization, recent empirical studies reveal a critical privacy vulnerability: these models exhibit…

Software Engineering · Computer Science 2025-09-18 Zhaoyang Chu, Yao Wan, Zhikun Zhang, Di Wang, Zhou Yang, Hongyu Zhang, Pan Zhou, Xuanhua Shi, Hai Jin, David Lo

Catastrophic Failure of LLM Unlearning via Quantization

Large language models (LLMs) have shown remarkable proficiency in generating text, benefiting from extensive training on vast textual corpora. However, LLMs may also acquire unwanted behaviors from the diverse and sensitive nature of their…

Computation and Language · Computer Science 2025-03-24 Zhiwei Zhang, Fali Wang, Xiaomin Li, Zongyu Wu, Xianfeng Tang, Hui Liu, Qi He, Wenpeng Yin, Suhang Wang

A Closer Look at Machine Unlearning for Large Language Models

Large language models (LLMs) may memorize sensitive or copyrighted content, raising privacy and legal concerns. Due to the high cost of retraining from scratch, researchers attempt to employ machine unlearning to remove specific content…

Computation and Language · Computer Science 2025-08-12 Xiaojian Yuan, Tianyu Pang, Chao Du, Kejiang Chen, Weiming Zhang, Min Lin

Assessing and Mitigating Data Memorization Risks in Fine-Tuned Large Language Models

Large Language Models (LLMs) have demonstrated remarkable capabilities across diverse natural language processing tasks, but their tendency to memorize training data poses significant privacy risks, particularly during fine-tuning…

Computation and Language · Computer Science 2025-08-21 Badrinath Ramakrishnan, Akshaya Balaji

Unveiling Over-Memorization in Finetuning LLMs for Reasoning Tasks

The pretrained large language models (LLMs) are finetuned with labeled data for better instruction following ability and alignment with human values. In this paper, we study the learning dynamics of LLM finetuning on reasoning tasks and…

Computation and Language · Computer Science 2025-09-30 Zhiwen Ruan, Yun Chen, Yutao Hou, Peng Li, Yang Liu, Guanhua Chen

Rote Learning Considered Useful: Generalizing over Memorized Data in LLMs

Rote learning is a memorization technique based on repetition. Many researchers argue that rote learning hinders generalization because it encourages verbatim memorization rather than deeper understanding. This concern extends even to…

Computation and Language · Computer Science 2026-03-03 Qinyuan Wu, Soumi Das, Mahsa Amani, Bishwamittra Ghosh, Mohammad Aflah Khan, Krishna P. Gummadi, Muhammad Bilal Zafar

Quantifying Memorization Across Neural Language Models

Large language models (LMs) have been shown to memorize parts of their training data, and when prompted appropriately, they will emit the memorized training data verbatim. This is undesirable because memorization violates privacy (exposing…

Machine Learning · Computer Science 2023-03-07 Nicholas Carlini, Daphne Ippolito, Matthew Jagielski, Katherine Lee, Florian Tramer, Chiyuan Zhang

Planting and Mitigating Memorized Content in Predictive-Text Language Models

Language models are widely deployed to provide automatic text completion services in user products. However, recent research has revealed that language models (especially large ones) bear considerable risk of memorizing private training…

Computation and Language · Computer Science 2022-12-19 C. M. Downey, Wei Dai, Huseyin A. Inan, Kim Laine, Saurabh Naik, Tomasz Religa

Unconsciously Forget: Mitigating Memorization; Without Knowing What is being Memorized

Recent advances in generative models have demonstrated an exceptional ability to produce highly realistic images. However, previous studies show that generated images often resemble the training data, and this problem becomes more severe as…

Computer Vision and Pattern Recognition · Computer Science 2025-12-15 Er Jin, Yang Zhang, Yongli Mou, Yanfei Dong, Stefan Decker, Kenji Kawaguchi, Johannes Stegmaier

Unlocking Memorization in Large Language Models with Dynamic Soft Prompting

Pretrained large language models (LLMs) have revolutionized natural language processing (NLP) tasks such as summarization, question answering, and translation. However, LLMs pose significant security risks due to their tendency to memorize…

Computation and Language · Computer Science 2024-09-24 Zhepeng Wang, Runxue Bao, Yawen Wu, Jackson Taylor, Cao Xiao, Feng Zheng, Weiwen Jiang, Shangqian Gao, Yanfu Zhang

TernaryLLM: Ternarized Large Language Model

Large language models (LLMs) have achieved remarkable performance on Natural Language Processing (NLP) tasks, but they are hindered by high computational costs and memory requirements. Ternarization, an extreme form of quantization, offers…

Machine Learning · Computer Science 2024-06-12 Tianqi Chen, Zhe Li, Weixiang Xu, Zeyu Zhu, Dong Li, Lu Tian, Emad Barsoum, Peisong Wang, Jian Cheng

Pruning as a Defense: Reducing Memorization in Large Language Models

Large language models have been shown to memorize significant portions of their training data, which they can reproduce when appropriately prompted. This work investigates the impact of simple pruning techniques on this behavior. Our…

Machine Learning · Computer Science 2025-02-25 Mansi Gupta, Nikhar Waghela, Sarthak Gupta, Shourya Goel, Sanjif Shanmugavelu

Knowledge Unlearning for Mitigating Privacy Risks in Language Models

Pretrained Language Models (LMs) memorize a vast amount of knowledge during initial pretraining, including information that may violate the privacy of personal lives and identities. Previous work addressing privacy issues for language…

Computation and Language · Computer Science 2022-12-20 Joel Jang, Dongkeun Yoon, Sohee Yang, Sungmin Cha, Moontae Lee, Lajanugen Logeswaran, Minjoon Seo