Related papers: Large-Scale Differentially Private BERT

Large Scale Transfer Learning for Differentially Private Image Classification

Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy. In the field of deep learning, Differentially Private Stochastic Gradient Descent (DP-SGD) has emerged as a…

Machine Learning · Computer Science 2022-05-24 Harsh Mehta, Abhradeep Thakurta, Alexey Kurakin, Ashok Cutkosky

Large Language Models Can Be Strong Differentially Private Learners

Differentially Private (DP) learning has seen limited success for building large deep learning models of text, and straightforward attempts at applying Differentially Private Stochastic Gradient Descent (DP-SGD) to NLP tasks have resulted…

Machine Learning · Computer Science 2022-11-11 Xuechen Li, Florian Tramèr, Percy Liang, Tatsunori Hashimoto

Differentially Private Fine-tuning of Language Models

We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve the state-of-the-art privacy versus utility tradeoffs on many standard NLP tasks. We propose a…

Machine Learning · Computer Science 2022-07-18 Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

Differentially Private Model Compression

Recent papers have shown that large pre-trained language models (LLMs) such as BERT, GPT-2 can be fine-tuned on private data to achieve performance comparable to non-private models for many downstream Natural Language Processing (NLP) tasks…

Machine Learning · Computer Science 2022-06-07 Fatemehsadat Mireshghallah, Arturs Backurs, Huseyin A Inan, Lukas Wutschitz, Janardhan Kulkarni

Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning

Fine-tuning on task-specific datasets is a widely-embraced paradigm of harnessing the powerful capability of pretrained LLMs for various downstream tasks. Due to the popularity of LLMs fine-tuning and its accompanying privacy concerns,…

Machine Learning · Computer Science 2025-03-11 Z Liu, J Lou, W Bao, Y Hu, B Li, Z Qin, K Ren

Fine-Tuning Large Language Models with User-Level Differential Privacy

We investigate practical and scalable algorithms for training large language models (LLMs) with user-level differential privacy (DP) in order to provably safeguard all the examples contributed by each user. We study two variants of DP-SGD…

Machine Learning · Computer Science 2024-07-11 Zachary Charles, Arun Ganesh, Ryan McKenna, H. Brendan McMahan, Nicole Mitchell, Krishna Pillutla, Keith Rush

An Efficient DP-SGD Mechanism for Large Scale NLP Models

Recent advances in deep learning have drastically improved performance on many Natural Language Understanding (NLU) tasks. However, the data used to train NLU models may contain private information such as addresses or phone numbers,…

Computation and Language · Computer Science 2022-03-03 Christophe Dupuy, Radhika Arava, Rahul Gupta, Anna Rumshisky

Unlocking High-Accuracy Differentially Private Image Classification through Scale

Differential Privacy (DP) provides a formal privacy guarantee preventing adversaries with access to a machine learning model from extracting information about individual training points. Differentially Private Stochastic Gradient Descent…

Machine Learning · Computer Science 2022-06-17 Soham De, Leonard Berrada, Jamie Hayes, Samuel L. Smith, Borja Balle

Training Large ASR Encoders with Differential Privacy

Self-supervised learning (SSL) methods for large speech models have proven to be highly effective at ASR. With the interest in public deployment of large pre-trained models, there is a rising concern for unintended memorization and leakage…

Sound · Computer Science 2024-09-24 Geeticka Chauhan, Steve Chien, Om Thakkar, Abhradeep Thakurta, Arun Narayanan

DP-FP: Differentially Private Forward Propagation for Large Models

When applied to large-scale learning problems, the conventional wisdom on privacy-preserving deep learning, known as Differential Private Stochastic Gradient Descent (DP-SGD), has met with limited success due to significant performance…

Machine Learning · Computer Science 2021-12-30 Jian Du, Haitao Mi

Differentially Private Optimization for Non-Decomposable Objective Functions

Unsupervised pre-training is a common step in developing computer vision models and large language models. In this setting, the absence of labels requires the use of similarity-based loss functions, such as contrastive loss, that favor…

Machine Learning · Computer Science 2025-02-21 Weiwei Kong, Andrés Muñoz Medina, Mónica Ribero

Sample-Efficient Differentially Private Fine-Tuning via Gradient Matrix Denoising

We address the challenge of sample efficiency in differentially private fine-tuning of large language models (LLMs) using DP-SGD. While DP-SGD provides strong privacy guarantees, the added noise significantly increases the entropy of…

Machine Learning · Computer Science 2026-01-12 Ali Dadsetan, Frank Rudzicz

A Closer Look at the Calibration of Differentially Private Learners

We systematically study the calibration of classifiers trained with differentially private stochastic gradient descent (DP-SGD) and observe miscalibration across a wide range of vision and language tasks. Our analysis identifies per-example…

Machine Learning · Computer Science 2022-11-16 Hanlin Zhang, Xuechen Li, Prithviraj Sen, Salim Roukos, Tatsunori Hashimoto

One size does not fit all: Investigating strategies for differentially-private learning across NLP tasks

Preserving privacy in contemporary NLP models allows us to work with sensitive data, but unfortunately comes at a price. We know that stricter privacy guarantees in differentially-private stochastic gradient descent (DP-SGD) generally…

Computation and Language · Computer Science 2023-02-01 Manuel Senge, Timour Igamberdiev, Ivan Habernal

Differentially Private Deep Learning with ModelMix

Training large neural networks with meaningful/usable differential privacy security guarantees is a demanding challenge. In this paper, we tackle this problem by revisiting the two key operations in Differentially Private Stochastic…

Machine Learning · Computer Science 2022-10-11 Hanshen Xiao, Jun Wan, Srinivas Devadas

Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors

In this paper, we explore the capacity of a language model-based method for grammatical error detection in detail. We first show that 5 to 10% of training data are enough for a BERT-based error detection method to achieve performance…

Computation and Language · Computer Science 2021-08-30 Ryo Nagata, Manabu Kimura, Kazuaki Hanawa

Fine-Tuning with Differential Privacy Necessitates an Additional Hyperparameter Search

Models need to be trained with privacy-preserving learning algorithms to prevent leakage of possibly sensitive information contained in their training data. However, canonical algorithms like differentially private stochastic gradient…

Machine Learning · Computer Science 2022-10-06 Yannis Cattan, Christopher A. Choquette-Choo, Nicolas Papernot, Abhradeep Thakurta

DP-Forward: Fine-tuning and Inference on Language Models with Differential Privacy in Forward Pass

Differentially private stochastic gradient descent (DP-SGD) adds noise to gradients in back-propagation, safeguarding training data from privacy leakage, particularly membership inference. It fails to cover (inference-time) threats like…

Cryptography and Security · Computer Science 2023-09-20 Minxin Du, Xiang Yue, Sherman S. M. Chow, Tianhao Wang, Chenyu Huang, Huan Sun

An Optimization Framework for Differentially Private Sparse Fine-Tuning

Differentially private stochastic gradient descent (DP-SGD) is broadly considered to be the gold standard for training and fine-tuning neural networks under differential privacy (DP). With the increasing availability of high-quality…

Machine Learning · Computer Science 2025-03-18 Mehdi Makni, Kayhan Behdin, Gabriel Afriat, Zheng Xu, Sergei Vassilvitskii, Natalia Ponomareva, Hussein Hazimeh, Rahul Mazumder

Differentially Private Parameter-Efficient Fine-tuning for Large ASR Models

Large ASR models can inadvertently leak sensitive information, which can be mitigated by formal privacy measures like differential privacy (DP). However, traditional DP training is computationally expensive, and can hurt model performance.…

Cryptography and Security · Computer Science 2024-10-04 Hongbin Liu, Lun Wang, Om Thakkar, Abhradeep Thakurta, Arun Narayanan