Related papers: Extracting Training Data from Large Language Model…

200 papers

Pre-trained large language models, such as GPT-2 and BERT, are often fine-tuned to achieve state-of-the-art performance on a downstream task. One natural example is the "Smart Reply" application where a pre-trained model is…

Cryptography and Security · Computer Science 2023-09-06 Bargav Jayaraman, Esha Ghosh, Melissa Chase, Sambuddha Roy, Wei Dai, David Evans

With the advance of language models, privacy protection is receiving more attention. Training data extraction is therefore of great importance, as it can serve as a potential tool to assess privacy leakage. However, due to the difficulty of…

Computation and Language · Computer Science 2023-06-02 Weichen Yu, Tianyu Pang, Qian Liu, Chao Du, Bingyi Kang, Yan Huang, Min Lin, Shuicheng Yan

Large language models have gained significant popularity because of their ability to generate human-like text and potential applications in various fields, such as Software Engineering. Large language models for code are commonly trained on…

Cryptography and Security · Computer Science 2024-01-17 Ali Al-Kaswan, Maliheh Izadi, Arie van Deursen

When large language models are trained on private data, it can be a significant privacy risk for them to memorize and regurgitate sensitive information. In this work, we propose a new practical data extraction attack that we call "neural…

Cryptography and Security · Computer Science 2024-03-05 Ashwinee Panda, Christopher A. Choquette-Choo, Zhengming Zhang, Yaoqing Yang, Prateek Mittal

This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of…

Language models are prone to memorizing their training data, making them vulnerable to extraction attacks. While existing research often examines isolated setups, such as a single model or a fixed prompt, real-world adversaries have a…

Cryptography and Security · Computer Science 2025-08-11 Yash More, Prakhar Ganesh, Golnoosh Farnadi

Model extraction attacks pose significant security threats to deployed language models, potentially compromising intellectual property and user privacy. This survey provides a comprehensive taxonomy of LLM-specific extraction attacks and…

Cryptography and Security · Computer Science 2025-07-09 Kaixiang Zhao, Lincan Li, Kaize Ding, Neil Zhenqiang Gong, Yue Zhao, Yushun Dong

Previous work has shown that Large Language Models are susceptible to so-called data extraction attacks. This allows an attacker to extract a sample that was contained in the training data, which has massive privacy implications. The…

Computation and Language · Computer Science 2023-02-16 Ali Al-Kaswan, Maliheh Izadi, Arie van Deursen

Past work has shown that large language models are susceptible to privacy attacks, where adversaries generate sequences from a trained model and detect which sequences are memorized from the training set. In this work, we show that the…

Cryptography and Security · Computer Science 2022-12-21 Nikhil Kandpal, Eric Wallace, Colin Raffel

High-quality training data has proven crucial for developing performant large language models (LLMs). However, commercial LLM providers disclose few, if any, details about the data used for training. This lack of transparency creates…

The text generated by large language models is commonly controlled by prompting, where a prompt prepended to a user's query guides the model's output. The prompts used by companies to guide their models are often treated as secrets, to be…

Computation and Language · Computer Science 2024-08-09 Yiming Zhang, Nicholas Carlini, Daphne Ippolito

Machine learning models are shown to face a severe threat from Model Extraction Attacks, where a well-trained private model owned by a service provider can be stolen by an attacker posing as a client. Unfortunately, prior works focus on…

Machine Learning · Computer Science 2021-12-02 Bang Wu, Xiangwen Yang, Shirui Pan, Xingliang Yuan

Recent advances in neural network based language models lead to successful deployments of such models, improving user experience in various applications. It has been demonstrated that strong performance of language models comes along with…

Cryptography and Security · Computer Science 2021-02-24 Huseyin A. Inan, Osman Ramadan, Lukas Wutschitz, Daniel Jones, Victor Rühle, James Withers, Robert Sim

Neural networks are often trained on proprietary datasets, making them attractive attack targets. We present a novel dataset extraction method leveraging an innovative training time backdoor attack, allowing a malicious federated learning…

Cryptography and Security · Computer Science 2025-12-19 Eden Luzon, Guy Amit, Roy Weiss, Torsten Kraub, Alexandra Dmitrienko, Yisroel Mirsky

It is perhaps no longer surprising that machine learning models, especially deep neural networks, are particularly vulnerable to attacks. One such vulnerability that has been well studied is model extraction: a phenomenon in which the…

Cryptography and Security · Computer Science 2022-07-27 Tejumade Afonja, Lucas Bourtoule, Varun Chandrasekaran, Sageev Oore, Nicolas Papernot

As the deployment of pre-trained language models (PLMs) expands, pressing security concerns have arisen regarding the potential for malicious extraction of training data, posing a threat to data privacy. This study is the first to provide a…

Computation and Language · Computer Science 2023-05-26 Shotaro Ishihara

Large Language Models (LLMs) are known to memorize significant portions of their training data. Parts of this memorized content have been shown to be extractable by simply querying the model, which poses a privacy risk. We present a novel…

Computation and Language · Computer Science 2023-05-22 Mustafa Safa Ozdayi, Charith Peris, Jack FitzGerald, Christophe Dupuy, Jimit Majmudar, Haidar Khan, Rahil Parikh, Rahul Gupta

Recent data-extraction attacks have exposed that language models can memorize some training samples verbatim. This is a vulnerability that can compromise the privacy of the model's training data. In this work, we introduce SubMix: a…

Machine Learning · Computer Science 2022-01-05 Antonio Ginart, Laurens van der Maaten, James Zou, Chuan Guo

The collection and availability of big data, combined with advances in pre-trained models (e.g. BERT), have revolutionized the predictive performance of natural language processing tasks. This allows corporations to provide machine learning…

Cryptography and Security · Computer Science 2022-11-01 Xuanli He, Chen Chen, Lingjuan Lyu, Qiongkai Xu

Large Language Models (LLMs) have been widely adopted to enhance Task-Oriented Dialogue Systems (TODS) by modeling complex language patterns and delivering contextually appropriate responses. However, this integration introduces significant…

Computation and Language · Computer Science 2026-03-05 Shuo Zhang, Junzhou Zhao, Junji Hou, Pinghui Wang, Chenxu Wang, Jing Tao