Related papers: Extracting Training Data from Unconditional Diffus…

200 papers

As diffusion probabilistic models (DPMs) become central to Generative AI (GenAI), understanding their memorization behavior is essential for evaluating risks such as data leakage, copyright infringement, and trustworthiness. While prior…

Machine Learning · Computer Science · 2025-08-04 · Yunhao Chen, Shujie Wang, Difan Zou, Xingjun Ma

Due to their capacity to generate novel and high-quality samples, diffusion models have attracted significant research interest in recent years. Notably, the typical training objective of diffusion models, i.e., denoising score matching,…

Machine Learning · Computer Science · 2025-02-21 · Xiangming Gu, Chao Du, Tianyu Pang, Chongxuan Li, Min Lin, Ye Wang

When do diffusion models reproduce their training data, and when are they able to generate samples beyond it? A practically relevant theoretical understanding of this interplay between memorization and generalization may significantly…

Machine Learning · Computer Science · 2025-08-26 · Sam Buchanan, Druv Pai, Yi Ma, Valentin De Bortoli

Diffusion models, widely used for image and video generation, face a significant limitation: the risk of memorizing and reproducing training data during inference, potentially generating unauthorized copyrighted content. While prior…

Computer Vision and Pattern Recognition · Computer Science · 2025-04-28 · Chen Chen, Enhuai Liu, Daochang Liu, Mubarak Shah, Chang Xu

Autoregressive language models (ARMs) have been shown to memorize and occasionally reproduce training data verbatim, raising concerns about privacy and copyright liability. Diffusion language models (DLMs) have recently emerged as a…

Computation and Language · Computer Science · 2026-03-04 · Xiaoyu Luo, Wenrui Yu, Qiongxiu Li, Johannes Bjerva

Recent breakthroughs in diffusion models have exhibited exceptional image-generation capabilities. However, studies show that some outputs are merely replications of training data. Such replications present potential legal challenges for…

Computer Vision and Pattern Recognition · Computer Science · 2024-08-01 · Yuxin Wen, Yuchen Liu, Chen Chen, Lingjuan Lyu

AI models present a wide range of applications in the field of medicine. However, achieving optimal performance requires access to extensive healthcare data, which is often not readily available. Furthermore, the imperative to preserve…

Pretrained diffusion models and their outputs are widely accessible due to their exceptional capacity for synthesizing high-quality images and their open-source nature. The users, however, may face litigation risks owing to the models'…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-02 · Chen Chen, Daochang Liu, Chang Xu

Large-scale text-to-image diffusion models excel in generating high-quality images from textual inputs, yet concerns arise as research indicates their tendency to memorize and replicate training data, raising… We also addressed the issue of…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-28 · Ruchika Chavhan, Ondrej Bohdal, Yongshuo Zong, Da Li, Timothy Hospedales

Diffusion models have shown strong performance in generating high-quality tabular data, but they carry privacy risks by reproducing exact training samples. While prior work focuses on dataset-level augmentation to reduce memorization,…

Machine Learning · Computer Science · 2026-05-26 · Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen, Xiaoge Zhang, Kaiyu Tang, Xiao Li, Jing Li

Memorization in large-scale text-to-image diffusion models poses significant security and intellectual property risks, enabling adversarial attribute extraction and the unauthorized reproduction of sensitive or proprietary features. While…

Machine Learning · Computer Science · 2026-01-28 · Divya Kothandaraman, Jaclyn Pytlarz

Diffusion-based models, such as the Stable Diffusion model, have revolutionized text-to-image synthesis with their ability to produce high-quality, high-resolution images. These advancements have prompted significant progress in image…

Cryptography and Security · Computer Science · 2023-12-07 · Ali Naseh, Jaechul Roh, Amir Houmansadr

Diffusion probabilistic models have become a cornerstone of modern generative AI, yet the mechanisms underlying their generalization remain poorly understood. In fact, if these models were perfectly minimizing their training loss, they…

Machine Learning · Computer Science · 2025-09-03 · Alessandro Favero, Antonio Sclocchi, Matthieu Wyart

When do language diffusion models memorize their training data, and how to quantitatively assess their true generative regime? We address these questions by showing that Uniform-based Discrete Diffusion Models (UDDMs) fundamentally behave…

Machine Learning · Computer Science · 2026-04-30 · Bao Pham, Mohammed J. Zaki, Luca Ambrogioni, Dmitry Krotov, Matteo Negri

The proliferation of diffusion models trained on web-scale, provenance-uncertain image collections has made it essential, yet technically unresolved, to determine whether a model has learned from specific copyrighted data without…

Machine Learning · Computer Science · 2026-04-06 · Muxing Li, Zesheng Ye, Sharon Li, Andy Song, Guangquan Zhang, Feng Liu

Diffusion models (DMs) memorize training images and can reproduce near-duplicates during generation. Current detection methods identify verbatim memorization but fail to capture two critical aspects: quantifying partial memorization…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-19 · Jimmy Z. Di, Yiwei Lu, Yaoliang Yu, Gautam Kamath, Adam Dziedzic, Franziska Boenisch

Memorization in large language models has been studied almost exclusively through prefix-conditioned extraction, a natural choice for autoregressive models. However, diffusion language models (DLMs) can denoise masked tokens at arbitrary…

Computation and Language · Computer Science · 2026-05-26 · Yihan Wang, N. Asokan

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual…

Cryptography and Security · Computer Science · 2023-01-31 · Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace

In this work, we investigate an intriguing and prevalent phenomenon of diffusion models which we term as "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often…

Machine Learning · Computer Science · 2024-06-11 · Huijie Zhang, Jinfan Zhou, Yifu Lu, Minzhe Guo, Peng Wang, Liyue Shen, Qing Qu

We introduce Compartmentalized Diffusion Models (CDM), a method to train different diffusion models (or prompts) on distinct data sources and arbitrarily compose them at inference time. The individual models can be trained in isolation, at…

Machine Learning · Computer Science · 2024-10-15 · Aditya Golatkar, Alessandro Achille, Ashwin Swaminathan, Stefano Soatto