Related papers: Smoothing the Score Function for Generalization in…

On the Interpolation Effect of Score Smoothing in Diffusion Models

Diffusion models have achieved remarkable progress in various domains with an intriguing ability to produce new data that do not exist in the training set. In this work, we study the hypothesis that such creativity arises from the neural…

Machine Learning · Computer Science 2026-04-21 Zhengdao Chen

Denoising Score Matching with Random Features: Insights on Diffusion Models from Precise Learning Curves

We theoretically investigate the phenomena of generalization and memorization in diffusion models. Empirical studies suggest that these phenomena are influenced by model complexity and the size of the training dataset. In our experiments,…

Machine Learning · Computer Science 2025-10-09 Anand Jerry George, Rodrigo Veiga, Nicolas Macris

Diffusion Models and the Manifold Hypothesis: Log-Domain Smoothing is Geometry Adaptive

Diffusion models have achieved state-of-the-art performance, demonstrating remarkable generalisation capabilities across diverse domains. However, the mechanisms underpinning these strong capabilities remain only partially understood. A…

Machine Learning · Computer Science 2025-10-03 Tyler Farghly, Peter Potaptchik, Samuel Howard, George Deligiannidis, Jakiw Pidstrigach

Deep Networks as Denoising Algorithms: Sample-Efficient Learning of Diffusion Models in High-Dimensional Graphical Models

We investigate the approximation efficiency of score functions by deep neural networks in diffusion-based generative modeling. While existing approximation theories utilize the smoothness of score functions, they suffer from the curse of…

Machine Learning · Computer Science 2023-09-21 Song Mei, Yuchen Wu

Generalization through variance: how noise shapes inductive biases in diffusion models

How diffusion models generalize beyond their training set is not known, and is somewhat mysterious given two facts: the optimum of the denoising score matching (DSM) objective usually used to train diffusion models is the score function of…

Machine Learning · Computer Science 2025-04-18 John J. Vastola

Implicit Regularisation in Diffusion Models: An Algorithm-Dependent Generalisation Analysis

The success of denoising diffusion models raises important questions regarding their generalisation behaviour, particularly in high-dimensional settings. Notably, it has been shown that when training and sampling are performed perfectly,…

Machine Learning · Statistics 2025-07-08 Tyler Farghly, Patrick Rebeschini, George Deligiannidis, Arnaud Doucet

Provable Separations between Memorization and Generalization in Diffusion Models

Diffusion models have achieved remarkable success across diverse domains, but they remain vulnerable to memorization -- reproducing training data rather than generating novel outputs. This not only limits their creative potential but also…

Machine Learning · Statistics 2025-11-10 Zeqi Ye, Qijie Zhu, Molei Tao, Minshuo Chen

Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study

Diffusion models now set the benchmark in high-fidelity generative sampling, yet they can, in principle, be prone to memorization. In this case, their learned score overfits the finite dataset so that the reverse-time SDE samples are mostly…

Machine Learning · Computer Science 2025-05-30 Franck Gabriel, François Ged, Maria Han Veiga, Emmanuel Schertzer

Manifold Generalization Provably Proceeds Memorization in Diffusion Models

Diffusion models often generate novel samples even when the learned score is only coarse -- a phenomenon not accounted for by the standard view of diffusion training as density estimation. In this paper, we show that, under the…

Machine Learning · Computer Science 2026-03-26 Zebang Shen, Ya-Ping Hsieh, Niao He

The Unreasonable Effectiveness of Gaussian Score Approximation for Diffusion Models and its Applications

By learning the gradient of smoothed data distributions, diffusion models can iteratively generate samples from complex distributions. The learned score function enables their generalization capabilities, but how the learned score relates…

Machine Learning · Computer Science 2024-12-16 Binxu Wang, John J. Vastola

Memorization and Regularization in Generative Diffusion Models

Diffusion models have emerged as a powerful framework for generative modeling. At the heart of the methodology is score matching: learning gradients of families of log-densities for noisy versions of the data distribution at different…

Machine Learning · Computer Science 2025-03-19 Ricardo Baptista, Agnimitra Dasgupta, Nikola B. Kovachki, Assad Oberai, Andrew M. Stuart

Training-Free Generative Sampling via Moment-Matched Score Smoothing

Diffusion models generate samples by denoising along the score of a perturbed target distribution. In practice, one trains a neural diffusion model, which is computationally expensive. Recent work suggests that score matching implicitly…

Machine Learning · Statistics 2026-05-15 Zhenyu Yao, Daniel Paulin

Temporal Score Rescaling for Temperature Sampling in Diffusion and Flow Models

We present a mechanism to steer the sampling diversity of denoising diffusion and flow matching models, allowing users to sample from a sharper or broader distribution than the training distribution. We build on the observation that these…

Machine Learning · Computer Science 2026-05-26 Yanbo Xu, Yu Wu, Sungjae Park, Zhizhuo Zhou, Shubham Tulsiani

Selective Underfitting in Diffusion Models

Diffusion models have emerged as the principal paradigm for generative modeling across various domains. During training, they learn the score function, which in turn is used to generate samples at inference. They raise a basic yet unsolved…

Machine Learning · Computer Science 2025-10-03 Kiwhan Song, Jaeyeon Kim, Sitan Chen, Yilun Du, Sham Kakade, Vincent Sitzmann

Deterministic Dynamics of Sampling Processes in Score-Based Diffusion Models with Multiplicative Noise Conditioning

Score-based diffusion models generate new samples by learning the score function associated with a diffusion process. While the effectiveness of these models can be theoretically explained using differential equations related to the…

Machine Learning · Computer Science 2026-01-21 Doheon Kim

Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models,…

Machine Learning · Computer Science 2023-02-15 Minshuo Chen, Kaixuan Huang, Tuo Zhao, Mengdi Wang

Diffusion Models Memorize in Training -- and Generalize in Inference

Diffusion models generalize well in practice. However, an optimal diffusion model fully memorizes the training data and therefore fails to generalize, raising the question of what induces generalization in a real diffusion model. We show…

Machine Learning · Computer Science 2026-05-21 Tim Kaiser, Markus Kollmann

From optimal score matching to optimal sampling

The recent, impressive advances in algorithmic generation of high-fidelity image, audio, and video are largely due to great successes in score-based diffusion models. A key implementing step is score matching, that is, the estimation of the…

Machine Learning · Statistics 2024-09-12 Zehao Dou, Subhodh Kotekal, Zhehao Xu, Harrison H. Zhou

Dimension-free Score Matching and Time Bootstrapping for Diffusion Models

Diffusion models generate samples by estimating the score function of the target distribution at various noise levels. The model is trained using samples drawn from the target distribution by progressively adding noise. Previous sample…

Machine Learning · Computer Science 2025-10-28 Syamantak Kumar, Dheeraj Nagaraj, Purnamrita Sarkar

Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure

In this work, we study the generalizability of diffusion models by looking into the hidden properties of the learned score functions, which are essentially a series of deep denoisers trained on various noise levels. We observe that as…

Machine Learning · Computer Science 2024-12-03 Xiang Li, Yixiang Dai, Qing Qu