Related papers: How Diffusion Models Learn to Factorize and Compos…

Do Diffusion Models Learn Semantically Meaningful and Efficient Representations?

Diffusion models are capable of impressive feats of image generation with uncommon juxtapositions such as astronauts riding horses on the moon with properly placed shadows. These outputs indicate the ability to perform compositional…

Machine Learning · Computer Science 2024-05-01 Qiyao Liang, Ziming Liu, Ila Fiete

When Do Diffusion Models learn to Generate Multiple Objects?

Text-to-image diffusion models achieve impressive visual fidelity, yet they remain unreliable in multi-object generation. Despite extensive empirical evidence of these failures, the underlying causes remain unclear. We begin by asking how…

Computer Vision and Pattern Recognition · Computer Science 2026-05-04 Yujin Jeong, Arnas Uselis, Iro Laina, Seong Joon Oh, Anna Rohrbach

Compositional Generalization via Forced Rendering of Disentangled Latents

Composition, the ability to generate myriad variations from finite means, is believed to underlie powerful generalization. However, compositional generalization remains a key challenge for deep learning. A widely held assumption is that…

Machine Learning · Computer Science 2025-05-27 Qiyao Liang, Daoyuan Qian, Liu Ziyin, Ila Fiete

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data…

Computer Vision and Pattern Recognition · Computer Science 2023-10-16 Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

How Compositional Generalization and Creativity Improve as Diffusion Models are Trained

Natural data is often organized as a hierarchical composition of features. How many samples do generative models need in order to learn the composition rules, so as to produce a combinatorially large number of novel data? What signal in the…

Machine Learning · Statistics 2025-06-05 Alessandro Favero, Antonio Sclocchi, Francesco Cagnetta, Pascal Frossard, Matthieu Wyart

Unite and Conquer: Plug & Play Multi-Modal Synthesis using Diffusion Models

Generating photos satisfying multiple constraints finds broad utility in the content creation industry. A key hurdle to accomplishing this task is the need for paired data consisting of all modalities (i.e., constraints) and their…

Computer Vision and Pattern Recognition · Computer Science 2023-04-21 Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara, Vishal M. Patel

Compositional Image Decomposition with Diffusion Models

Given an image of a natural scene, we are able to quickly decompose it into a set of components such as objects, lighting, shadows, and foreground. We can then envision a scene where we combine certain components with those from other…

Computer Vision and Pattern Recognition · Computer Science 2024-06-28 Jocelin Su, Nan Liu, Yanbo Wang, Joshua B. Tenenbaum, Yilun Du

Compositional Abilities Emerge Multiplicatively: Exploring Diffusion Models on a Synthetic Task

Modern generative models exhibit unprecedented capabilities to generate extremely realistic data. However, given the inherent compositionality of the real world, reliable use of these models in practical applications requires that they…

Machine Learning · Computer Science 2025-07-29 Maya Okawa, Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka

Diffusion Model for Manifold Data: Score Decomposition, Curvature, and Statistical Complexity

Diffusion models have become a leading framework in generative modeling, yet their theoretical understanding, especially for high-dimensional data concentrated on low-dimensional structures, remains incomplete. This paper investigates…

Machine Learning · Computer Science 2026-04-29 Zixuan Zhang, Kaixuan Huang, Tuo Zhao, Mengdi Wang, Minshuo Chen

Bring the Power of Diffusion Model to Defect Detection

Due to the high complexity and technical requirements of industrial production processes, surface defects will inevitably appear, which seriously affects the quality of products. Although existing lightweight detection networks are highly…

Computer Vision and Pattern Recognition · Computer Science 2024-08-27 Xuyi Yu

Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

Diffusion models, though originally designed for generative tasks, have demonstrated impressive self-supervised representation learning capabilities. A particularly intriguing phenomenon in these models is the emergence of unimodal…

Machine Learning · Computer Science 2026-02-04 Xiao Li, Zekai Zhang, Xiang Li, Siyi Chen, Zhihui Zhu, Peng Wang, Qing Qu

On the Feature Learning in Diffusion Models

The predominant success of diffusion models in generative modeling has spurred significant interest in understanding their theoretical foundations. In this work, we propose a feature learning framework aimed at analyzing and comparing the…

Machine Learning · Statistics 2025-03-04 Andi Han, Wei Huang, Yuan Cao, Difan Zou

A Phase Transition in Diffusion Models Reveals the Hierarchical Nature of Data

Understanding the structure of real data is paramount in advancing modern deep-learning methodologies. Natural data such as images are believed to be composed of features organized in a hierarchical and combinatorial manner, which neural…

Machine Learning · Statistics 2024-12-25 Antonio Sclocchi, Alessandro Favero, Matthieu Wyart

Shaping Inductive Bias in Diffusion Models through Frequency-Based Noise Control

Diffusion Probabilistic Models (DPMs) are powerful generative models that have achieved unparalleled success in a number of generative tasks. In this work, we aim to build inductive biases into the training and sampling of diffusion models…

Machine Learning · Computer Science 2025-03-14 Thomas Jiralerspong, Berton Earnshaw, Jason Hartford, Yoshua Bengio, Luca Scimeca

The Emergence of Reproducibility and Generalizability in Diffusion Models

In this work, we investigate an intriguing and prevalent phenomenon of diffusion models which we term "consistent model reproducibility": given the same starting noise input and a deterministic sampler, different diffusion models often…

Machine Learning · Computer Science 2024-06-11 Huijie Zhang, Jinfan Zhou, Yifu Lu, Minzhe Guo, Peng Wang, Liyue Shen, Qing Qu

Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure

In this work, we study the generalizability of diffusion models by looking into the hidden properties of the learned score functions, which are essentially a series of deep denoisers trained on various noise levels. We observe that as…

Machine Learning · Computer Science 2024-12-03 Xiang Li, Yixiang Dai, Qing Qu

Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

Diffusion probabilistic models (DPMs) have achieved remarkable quality in image generation that rivals GANs'. But unlike GANs, DPMs use a set of latent variables that lack semantic meaning and cannot serve as a useful representation for…

Computer Vision and Pattern Recognition · Computer Science 2022-03-14 Konpat Preechakul, Nattanat Chatthee, Suttisak Wizadwongsa, Supasorn Suwajanakorn

Generalization of Diffusion Models Arises with a Balanced Representation Space

Diffusion models excel at generating high-quality, diverse samples, yet they risk memorizing training data when overfit to the training objective. We analyze the distinctions between memorization and generalization in diffusion models…

Machine Learning · Computer Science 2026-02-12 Zekai Zhang, Xiao Li, Xiang Li, Lianghe Shi, Meng Wu, Molei Tao, Qing Qu

Compositional Visual Generation with Composable Diffusion Models

Large text-guided diffusion models, such as DALLE-2, are able to generate stunning photorealistic images given natural language descriptions. While such models are highly flexible, they struggle to understand the composition of certain…

Computer Vision and Pattern Recognition · Computer Science 2023-01-18 Nan Liu, Shuang Li, Yilun Du, Antonio Torralba, Joshua B. Tenenbaum

Diffusion Model as Representation Learner

Diffusion Probabilistic Models (DPMs) have recently demonstrated impressive results on various generative tasks. Despite their promise, however, the learned representations of pre-trained DPMs have not been fully understood. In this paper,…

Computer Vision and Pattern Recognition · Computer Science 2023-08-23 Xingyi Yang, Xinchao Wang