Related papers: Tuning Timestep-Distilled Diffusion Model Using Pa…

Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization

Generating visually appealing images is fundamental to modern text-to-image generation models. A potential solution to better aesthetics is direct preference optimization (DPO), which has been applied to diffusion models to improve general…

Computer Vision and Pattern Recognition · Computer Science 2025-03-26 Zhanhao Liang, Yuhui Yuan, Shuyang Gu, Bohan Chen, Tiankai Hang, Mingxi Cheng, Ji Li, Liang Zheng

Diffusion Distillation With Direct Preference Optimization For Efficient 3D LiDAR Scene Completion

The application of diffusion models in 3D LiDAR scene completion is limited due to diffusion's slow sampling speed. Score distillation accelerates diffusion sampling but with performance degradation, while post-training with direct policy…

Computer Vision and Pattern Recognition · Computer Science 2025-04-17 An Zhao, Shengyuan Zhang, Ling Yang, Zejian Li, Jiale Wu, Haoran Xu, AnYang Wei, Perry Pengyun GU, Lingyun Sun

Direct Diffusion Score Preference Optimization via Stepwise Contrastive Policy-Pair Supervision

Diffusion models have achieved impressive results in generative tasks such as text-to-image synthesis, yet they often struggle to fully align outputs with nuanced user intent and maintain consistent aesthetic quality. Existing…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Dohyun Kim, Seungwoo Lyu, Seung Wook Kim, Paul Hongsuck Seo

D-OPSD: On-Policy Self-Distillation for Continuously Tuning Step-Distilled Diffusion Models

The landscape of high-performance image generation models is currently shifting from the inefficient multi-step ones to the efficient few-step counterparts (e.g., Z-Image-Turbo and FLUX.2-klein). However, these models present significant…

Computer Vision and Pattern Recognition · Computer Science 2026-05-27 Dengyang Jiang, Xin Jin, Dongyang Liu, Zanyi Wang, Mingzhe Zheng, Ruoyi Du, Xiangpeng Yang, Qilong Wu, Zhen Li, Peng Gao, Harry Yang, Steven Hoi

Multistep Distillation of Diffusion Models via Moment Matching

We present a new method for making diffusion models faster to sample. The method distills many-step diffusion models into few-step models by matching conditional expectations of the clean data given noisy data along the sampling trajectory.…

Machine Learning · Computer Science 2024-06-07 Tim Salimans, Thomas Mensink, Jonathan Heek, Emiel Hoogeboom

Posterior Distillation Sampling

We introduce Posterior Distillation Sampling (PDS), a novel optimization method for parametric image editing based on diffusion models. Existing optimization-based methods, which leverage the powerful 2D prior of diffusion models to handle…

Computer Vision and Pattern Recognition · Computer Science 2024-04-03 Juil Koo, Chanho Park, Minhyuk Sung

Scale-wise Distillation of Diffusion Models

Recent diffusion distillation methods have achieved remarkable progress, enabling high-quality ${\sim}4$-step sampling for large-scale text-conditional image and video diffusion models. However, further reducing the number of sampling steps…

Computer Vision and Pattern Recognition · Computer Science 2026-03-04 Nikita Starodubcev, Ilya Drobyshevskiy, Denis Kuznedelev, Artem Babenko, Dmitry Baranchuk

SIPO: Stabilized and Improved Preference Optimization for Aligning Diffusion Models

Preference learning has garnered extensive attention as an effective technique for aligning diffusion models with human preferences in visual generation. However, existing alignment approaches such as Diffusion-DPO suffer from two…

Machine Learning · Computer Science 2026-05-19 Xiaomeng Yang, Mengping Yang, Junyan Wang, Zhijian Zhou, Zhiyu Tan, Hao Li

Inference-Time Diffusion Model Distillation

Diffusion distillation models effectively accelerate reverse sampling by compressing the process into fewer steps. However, these models still exhibit a performance gap compared to their pre-trained diffusion model counterparts, exacerbated…

Computer Vision and Pattern Recognition · Computer Science 2024-12-13 Geon Yeong Park, Sang Wan Lee, Jong Chul Ye

Diffusion Models Are Innate One-Step Generators

Diffusion Models (DMs) have achieved great success in image generation and other fields. By fine sampling through the trajectory defined by the SDE/ODE solver based on a well-trained score model, DMs can generate remarkable high-quality…

Computer Vision and Pattern Recognition · Computer Science 2024-06-10 Bowen Zheng, Tianming Yang

D-Fusion: Direct Preference Optimization for Aligning Diffusion Models with Visually Consistent Samples

The practical applications of diffusion models have been limited by the misalignment between generated images and corresponding text prompts. Recent studies have introduced direct preference optimization (DPO) to enhance the alignment of…

Computer Vision and Pattern Recognition · Computer Science 2025-05-29 Zijing Hu, Fengda Zhang, Kun Kuang

Single Trajectory Distillation for Accelerating Image and Video Style Transfer

Diffusion-based stylization methods typically denoise from a specific partial noise state for image-to-image and video-to-video tasks. This multi-step diffusion process is computationally expensive and hinders real-world application. A…

Computer Vision and Pattern Recognition · Computer Science 2024-12-30 Sijie Xu, Runqi Wang, Wei Zhu, Dejia Song, Nemo Chen, Xu Tang, Yao Hu

Smoothed Preference Optimization via ReNoise Inversion for Aligning Diffusion Models with Varied Human Preferences

Direct Preference Optimization (DPO) aligns text-to-image (T2I) generation models with human preferences using pairwise preference data. Although substantial resources are expended in collecting and labeling datasets, a critical aspect is…

Computer Vision and Pattern Recognition · Computer Science 2025-06-09 Yunhong Lu, Qichao Wang, Hengyuan Cao, Xiaoyin Xu, Min Zhang

Preference-Based Alignment of Discrete Diffusion Models

Diffusion models have achieved state-of-the-art performance across multiple domains, with recent advancements extending their applicability to discrete data. However, aligning discrete diffusion models with task-specific preferences remains…

Machine Learning · Computer Science 2025-04-10 Umberto Borso, Davide Paglieri, Jude Wells, Tim Rocktäschel

DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis

Diffusion models have demonstrated significant potential in speech synthesis tasks, including text-to-speech (TTS) and voice cloning. However, their iterative denoising processes are computationally intensive, and previous distillation…

Audio and Speech Processing · Electrical Eng. & Systems 2025-02-21 Yinghao Aaron Li, Rithesh Kumar, Zeyu Jin

Input-Aware Sparse Attention for Real-Time Co-Speech Video Generation

Diffusion models can synthesize realistic co-speech video from audio for various applications, such as video creation and virtual agents. However, existing diffusion-based methods are slow due to numerous denoising steps and costly…

Computer Vision and Pattern Recognition · Computer Science 2025-10-06 Beijia Lu, Ziyi Chen, Jing Xiao, Jun-Yan Zhu

On Distillation of Guided Diffusion Models

Classifier-free guided diffusion models have recently been shown to be highly effective at high-resolution image generation, and they have been widely used in large-scale diffusion frameworks including DALLE-2, Stable Diffusion and Imagen.…

Computer Vision and Pattern Recognition · Computer Science 2023-04-14 Chenlin Meng, Robin Rombach, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans

CaO$_2$: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation

The recent introduction of diffusion models in dataset distillation has shown promising potential in creating compact surrogate datasets for large, high-resolution target datasets, offering improved efficiency and performance over…

Computer Vision and Pattern Recognition · Computer Science 2025-07-10 Haoxuan Wang, Zhenghao Zhao, Junyi Wu, Yuzhang Shang, Gaowen Liu, Yan Yan

DSFlow: Dual Supervision and Step-Aware Architecture for One-Step Flow Matching Speech Synthesis

Flow-matching models have enabled high-quality text-to-speech synthesis, but their iterative sampling process during inference incurs substantial computational cost. Although distillation is widely used to reduce the number of inference…

Sound · Computer Science 2026-02-11 Bin Lin, Peng Yang, Chao Yan, Xiaochen Liu, Wei Wang, Boyong Wu, Pengfei Tan, Xuerui Yang

SUDO: Enhancing Text-to-Image Diffusion Models with Self-Supervised Direct Preference Optimization

Previous text-to-image diffusion models typically employ supervised fine-tuning (SFT) to enhance pre-trained base models. However, this approach primarily minimizes the loss of mean squared error (MSE) at the pixel level, neglecting the…

Computer Vision and Pattern Recognition · Computer Science 2025-04-22 Liang Peng, Boxi Wu, Haoran Cheng, Yibo Zhao, Xiaofei He