Related papers: FreeInit: Bridging Initialization Gap in Video Dif…

FastInit: Fast Noise Initialization for Temporally Consistent Video Generation

Video generation has made significant strides with the development of diffusion models; however, achieving high temporal consistency remains a challenging task. Recently, FreeInit identified a training-inference gap and introduced a method…

Computer Vision and Pattern Recognition · Computer Science 2025-08-06 Chengyu Bai , Yuming Li , Zhongyu Zhao , Jintao Chen , Peidong Jia , Qi She , Ming Lu , Shanghang Zhang

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling

With the availability of large-scale video datasets and the advances of diffusion models, text-driven video generation has achieved substantial progress. However, existing video generation models are typically trained on a limited number of…

Computer Vision and Pattern Recognition · Computer Science 2024-01-31 Haonan Qiu , Menghan Xia , Yong Zhang , Yingqing He , Xintao Wang , Ying Shan , Ziwei Liu

FreqPrior: Improving Video Diffusion Models with Frequency Filtering Gaussian Noise

Text-driven video generation has advanced significantly due to developments in diffusion models. Beyond the training and sampling phases, recent studies have investigated noise priors of diffusion models, as improved noise priors yield…

Image and Video Processing · Electrical Eng. & Systems 2025-02-20 Yunlong Yuan , Yuanfan Guo , Chunwei Wang , Wei Zhang , Hang Xu , Li Zhang

Can Diffusion Model Achieve Better Performance in Text Generation? Bridging the Gap between Training and Inference!

Diffusion models have been successfully adapted to text generation tasks by mapping the discrete text into the continuous space. However, there exist nonnegligible gaps between training and inference, owing to the absence of the forward…

Computation and Language · Computer Science 2023-05-09 Zecheng Tang , Pinzheng Wang , Keyan Zhou , Juntao Li , Ziqiang Cao , Min Zhang

InitNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization

Recent strides in the development of diffusion models, exemplified by advancements such as Stable Diffusion, have underscored their remarkable prowess in generating visually compelling images. However, the imperative of achieving a seamless…

Computer Vision and Pattern Recognition · Computer Science 2024-04-09 Xiefan Guo , Jinlin Liu , Miaomiao Cui , Jiankai Li , Hongyu Yang , Di Huang

FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for Diffusion Models

In recent years, large-scale pre-trained diffusion models have demonstrated their outstanding capabilities in image and video generation tasks. However, existing models tend to produce visual objects commonly found in the training dataset,…

Computer Vision and Pattern Recognition · Computer Science 2024-07-30 Changgu Chen , Libing Yang , Xiaoyan Yang , Lianggangxu Chen , Gaoqi He , Changbo Wang , Yang Li

Training-free Diffusion Acceleration with Bottleneck Sampling

Diffusion models have demonstrated remarkable capabilities in visual content generation but remain challenging to deploy due to their high computational cost during inference. This computational burden primarily arises from the quadratic…

Computer Vision and Pattern Recognition · Computer Science 2025-03-28 Ye Tian , Xin Xia , Yuxi Ren , Shanchuan Lin , Xing Wang , Xuefeng Xiao , Yunhai Tong , Ling Yang , Bin Cui

Noise Crystallization and Liquid Noise: Zero-shot Video Generation using Image Diffusion Models

Although powerful for image generation, consistent and controllable video is a longstanding problem for diffusion models. Video models require extensive training and computational resources, leading to high costs and large environmental…

Computer Vision and Pattern Recognition · Computer Science 2024-10-10 Muhammad Haaris Khan , Hadrien Reynaud , Bernhard Kainz

Preserving Image Properties Through Initializations in Diffusion Models

Retail photography imposes specific requirements on images. For instance, images may need uniform background colors, consistent model poses, centered products, and consistent lighting. Minor deviations from these standards impact a site's…

Computer Vision and Pattern Recognition · Computer Science 2024-01-05 Jeffrey Zhang , Shao-Yu Chang , Kedan Li , David Forsyth

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos

Video diffusion models (VDMs) facilitate the generation of high-quality videos, with current research predominantly concentrated on scaling efforts during training through improvements in data quality, computational resources, and model…

Machine Learning · Computer Science 2025-05-27 Haolin Yang , Feilong Tang , Ming Hu , Qingyu Yin , Yulong Li , Yexin Liu , Zelin Peng , Peng Gao , Junjun He , Zongyuan Ge , Imran Razzak

Towards Faster Training of Diffusion Models: An Inspiration of A Consistency Phenomenon

Diffusion models (DMs) are a powerful generative framework that have attracted significant attention in recent years. However, the high computational cost of training DMs limits their practical applications. In this paper, we start with a…

Machine Learning · Computer Science 2024-04-12 Tianshuo Xu , Peng Mi , Ruilin Wang , Yingcong Chen

FreeEnhance: Tuning-Free Image Enhancement via Content-Consistent Noising-and-Denoising Process

The emergence of text-to-image generation models has led to the recognition that image enhancement, performed as post-processing, would significantly improve the visual quality of the generated images. Exploring diffusion models to enhance…

Computer Vision and Pattern Recognition · Computer Science 2024-09-12 Yang Luo , Yiheng Zhang , Zhaofan Qiu , Ting Yao , Zhineng Chen , Yu-Gang Jiang , Tao Mei

Adjusting Initial Noise to Mitigate Memorization in Text-to-Image Diffusion Models

Despite their impressive generative capabilities, text-to-image diffusion models often memorize and replicate training data, prompting serious concerns over privacy and copyright. Recent work has attributed this memorization to an…

Computer Vision and Pattern Recognition · Computer Science 2025-10-13 Hyeonggeun Han , Sehwan Kim , Hyungjun Joo , Sangwoo Hong , Jungwoo Lee

DiffusionVMR: Diffusion Model for Joint Video Moment Retrieval and Highlight Detection

Video moment retrieval and highlight detection have received attention in the current era of video content proliferation, aiming to localize moments and estimate clip relevances based on user-specific queries. Given that the video content…

Computer Vision and Pattern Recognition · Computer Science 2024-03-05 Henghao Zhao , Kevin Qinghong Lin , Rui Yan , Zechao Li

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Generative models have made significant impacts across various domains, largely due to their ability to scale during training by increasing data, computational resources, and model size, a phenomenon characterized by the scaling laws.…

Computer Vision and Pattern Recognition · Computer Science 2025-01-17 Nanye Ma , Shangyuan Tong , Haolin Jia , Hexiang Hu , Yu-Chuan Su , Mingda Zhang , Xuan Yang , Yandong Li , Tommi Jaakkola , Xuhui Jia , Saining Xie

Noise Aggregation Analysis Driven by Small-Noise Injection: Efficient Membership Inference for Diffusion Models

Diffusion models have demonstrated powerful performance in generating high-quality images. A typical example is text-to-image generator like Stable Diffusion. However, their widespread use also poses potential privacy risks. A key concern…

Computer Vision and Pattern Recognition · Computer Science 2026-04-20 Guo Li , Weihong Chen , Yongfu Fan

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Diffusion models have recently achieved great success in the synthesis of high-quality images and videos. However, the existing denoising techniques in diffusion models are commonly based on step-by-step noise predictions, which suffers…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Hancheng Ye , Jiakang Yuan , Renqiu Xia , Xiangchao Yan , Tao Chen , Junchi Yan , Botian Shi , Bo Zhang

Video Diffusion Models

Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion model for video generation that shows very promising initial…

Computer Vision and Pattern Recognition · Computer Science 2022-06-24 Jonathan Ho , Tim Salimans , Alexey Gritsenko , William Chan , Mohammad Norouzi , David J. Fleet

How I Warped Your Noise: a Temporally-Correlated Noise Prior for Diffusion Models

Video editing and generation methods often rely on pre-trained image-based diffusion models. During the diffusion process, however, the reliance on rudimentary noise sampling techniques that do not preserve correlations present in…

Computer Vision and Pattern Recognition · Computer Science 2025-04-07 Pascal Chang , Jingwei Tang , Markus Gross , Vinicius C. Azevedo

DBINDS -- Can Initial Noise from Diffusion Model Inversion Help Reveal AI-Generated Videos?

AI-generated video has advanced rapidly and poses serious challenges to content security and forensic analysis. Existing detectors rely mainly on pixel-level visual cues and generalize poorly to unseen generators. We propose DBINDS, a…

Computer Vision and Pattern Recognition · Computer Science 2025-11-13 Yanlin Wu , Xiaogang Yuan , Dezhi An