Related papers: Towards Privacy-Preserving Code Generation: Differ…

PrivCode: When Code Generation Meets Differential Privacy

Large language models (LLMs) have presented outstanding performance in code generation and completion. However, fine-tuning these models on private datasets can raise privacy and proprietary concerns, such as the leakage of sensitive…

Cryptography and Security · Computer Science 2026-01-16 Zheng Liu, Chen Gong, Terry Yue Zhuo, Kecen Li, Weichen Yu, Matt Fredrikson, Tianhao Wang

Can Differentially Private Fine-tuning LLMs Protect Against Privacy Attacks?

Fine-tuning large language models (LLMs) has become an essential strategy for adapting them to specialized tasks; however, this process introduces significant privacy challenges, as sensitive training data may be inadvertently memorized and…

Cryptography and Security · Computer Science 2025-05-02 Hao Du, Shang Liu, Yang Cao

Protecting Private Code in IDE Autocomplete using Differential Privacy

Modern Integrated Development Environments (IDEs) increasingly leverage Large Language Models (LLMs) to provide advanced features like code autocomplete. While powerful, training these models on user-written code introduces significant…

Cryptography and Security · Computer Science 2026-02-02 Evgeny Grigorenko, David Stanojević, David Ilić, Egor Bogomolov, Kostadin Cvejoski

SafeSynthDP: Leveraging Large Language Models for Privacy-Preserving Synthetic Data Generation Using Differential Privacy

Machine learning (ML) models frequently rely on training data that may include sensitive or personal information, raising substantial privacy concerns. Legislative frameworks such as the General Data Protection Regulation (GDPR) and the…

Machine Learning · Computer Science 2024-12-31 Md Mahadi Hasan Nahid, Sadid Bin Hasan

Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning

Large language models (LLMs) have emerged as powerful tools for tackling complex tasks across diverse domains, but they also raise privacy concerns when fine-tuned on sensitive data due to potential memorization. While differential privacy…

Computation and Language · Computer Science 2024-08-19 Lynn Chua, Badih Ghazi, Yangsibo Huang, Pritish Kamath, Ravi Kumar, Daogao Liu, Pasin Manurangsi, Amer Sinha, Chiyuan Zhang

Efficient and Private: Memorisation under differentially private parameter-efficient fine-tuning in language models

Fine-tuning large language models (LLMs) for specific tasks introduces privacy risks, as models may inadvertently memorise and leak sensitive training data. While Differential Privacy (DP) offers a solution to mitigate these risks, it…

Machine Learning · Computer Science 2024-11-26 Olivia Ma, Jonathan Passerat-Palmbach, Dmitrii Usynin

Differentially Private Natural Language Models: Recent Advances and Future Directions

Recent developments in deep learning have led to great success in various natural language processing (NLP) tasks. However, these applications may involve data that contain sensitive information. Therefore, how to achieve good performance…

Computation and Language · Computer Science 2023-10-24 Lijie Hu, Ivan Habernal, Lei Shen, Di Wang

How Does Differential Privacy Affect Social Bias in LLMs? A Systematic Evaluation

Large language models (LLMs) trained on web-scale corpora can memorize sensitive training data, posing significant privacy risks. Differential privacy (DP) has emerged as a principled framework that limits the influence of individual data…

Computation and Language · Computer Science 2026-05-13 Eduardo Tenorio, Karuna Bhaila, Xintao Wu

Privacy Preserving In-Context-Learning Framework for Large Language Models

Large language models (LLMs) have significantly transformed natural language understanding and generation, but they raise privacy concerns due to potential exposure of sensitive information. Studies have highlighted the risk of information…

Machine Learning · Computer Science 2025-11-20 Bishnu Bhusal, Manoj Acharya, Ramneet Kaur, Colin Samplawski, Anirban Roy, Adam D. Cobb, Rohit Chadha, Susmit Jha

Differentially Private Language Models for Secure Data Sharing

To protect the privacy of individuals whose data is being shared, it is of high importance to develop methods allowing researchers and companies to release textual data while providing formal privacy guarantees to its originators. In the…

Machine Learning · Computer Science 2022-10-27 Justus Mattern, Zhijing Jin, Benjamin Weggenmann, Bernhard Schoelkopf, Mrinmaya Sachan

Does Differential Privacy Impact Bias in Pretrained NLP Models?

Differential privacy (DP) is applied when fine-tuning pre-trained large language models (LLMs) to limit leakage of training examples. While most DP research has focused on improving a model's privacy-utility tradeoff, some find that DP can…

Computation and Language · Computer Science 2024-10-25 Md. Khairul Islam, Andrew Wang, Tianhao Wang, Yangfeng Ji, Judy Fox, Jieyu Zhao

Thinking Outside of the Differential Privacy Box: A Case Study in Text Privatization with Language Model Prompting

The field of privacy-preserving Natural Language Processing has risen in popularity, particularly at a time when concerns about privacy grow with the proliferation of Large Language Models. One solution consistently appearing in recent…

Computation and Language · Computer Science 2024-10-02 Stephen Meisenbacher, Florian Matthes

Differentially Private Language Models Benefit from Public Pre-training

Language modeling is a keystone task in natural language processing. When training a language model on sensitive information, differential privacy (DP) allows us to quantify the degree to which our private data is protected. However,…

Machine Learning · Computer Science 2020-10-27 Gavin Kerrigan, Dylan Slack, Jens Tuyls

DP-MemArc: Differential Privacy Transfer Learning for Memory Efficient Language Models

Large language models have repeatedly shown outstanding performance across diverse applications. However, deploying these models can inadvertently risk user privacy. The significant memory demands during training pose a major challenge in…

Cryptography and Security · Computer Science 2025-02-21 Yanming Liu, Xinyue Peng, Yuwei Zhang, Xiaolan Ke, Songhang Deng, Jiannan Cao, Chen Ma, Mengchen Fu, Tianyu Du, Sheng Cheng, Xun Wang, Jianwei Yin, Xuhong Zhang

Efficient DP-SGD for LLMs with Randomized Clipping

Large language models (LLMs) are trained on vast datasets that may contain sensitive information. Differential privacy (DP), the de facto standard for formal privacy guarantees, provides a principled framework for training LLMs with…

Machine Learning · Computer Science 2026-05-26 Enayat Ullah, Sai Aparna Aketi, Devansh Gupta, Huanyu Zhang, Meisam Razaviyayn

Dual-Priv Pruning: Efficient Differential Private Fine-Tuning in Multimodal Large Language Models

Differential Privacy (DP) is a widely adopted technique, valued for its effectiveness in protecting the privacy of task-specific datasets, making it a critical tool for large language models. However, its effectiveness in Multimodal Large…

Cryptography and Security · Computer Science 2025-06-10 Qianshan Wei, Jiaqi Li, Zihan You, Yi Zhan, Kecen Li, Jialin Wu, Xinfeng Li, Hengjun Liu, Yi Yu, Bin Cao, Yiwen Xu, Yang Liu, Guilin Qi

Adaptive Token-Weighted Differential Privacy for LLMs: Not All Tokens Require Equal Protection

Large language models (LLMs) frequently memorize sensitive or personal information, raising significant privacy concerns. Existing variants of differential privacy stochastic gradient descent (DPSGD) inject uniform noise into every gradient…

Machine Learning · Computer Science 2025-09-30 Manjiang Yu, Priyanka Singh, Xue Li, Yang Cao

DP-SelFT: Differentially Private Selective Fine-Tuning for Large Language Models

Large language models (LLMs) are commonly adapted to downstream tasks through fine-tuning, but fine-tuning data often contains sensitive information that may be leaked by the resulting model. Differential privacy (DP) offers formal…

Machine Learning · Computer Science 2026-05-19 Haichao Sha, Zihao Wang, Yuncheng Wu, Hong Chen, Wei Dong

Gradient Masking and the Underestimated Robustness Threats of Differential Privacy in Deep Learning

An important problem in deep learning is the privacy and security of neural networks (NNs). Both aspects have long been considered separately. To date, it is still poorly understood how privacy enhancing training affects the robustness of…

Cryptography and Security · Computer Science 2021-05-18 Franziska Boenisch, Philip Sperl, Konstantin Böttinger

Differentially Private Training of Mixture of Experts Models

This position paper investigates the integration of Differential Privacy (DP) in the training of Mixture of Experts (MoE) models within the field of natural language processing. As Large Language Models (LLMs) scale to billions of…

Cryptography and Security · Computer Science 2024-02-13 Pierre Tholoniat, Huseyin A. Inan, Janardhan Kulkarni, Robert Sim