Related papers: Video Diffusion Models

Video Diffusion Models: A Survey

Diffusion generative models have recently become a powerful technique for creating and modifying high-quality, coherent video content. This survey provides a comprehensive overview of the critical components of diffusion models for video…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Andrew Melnik, Michal Ljubljanac, Cong Lu, Qi Yan, Weiming Ren, Helge Ritter

Latent Video Diffusion Models for High-Fidelity Long Video Generation

AI-generated content has attracted lots of attention recently, but photo-realistic video synthesis is still challenging. Although many attempts using GANs and autoregressive models have been made in this area, the visual quality and length…

Computer Vision and Pattern Recognition · Computer Science 2023-03-21 Yingqing He, Tianyu Yang, Yong Zhang, Ying Shan, Qifeng Chen

FlexiFilm: Long Video Generation with Flexible Conditions

Generating long and consistent videos has emerged as a significant yet challenging problem. While most existing diffusion-based video generation models, derived from image generation models, demonstrate promising performance in generating…

Computer Vision and Pattern Recognition · Computer Science 2024-04-30 Yichen Ouyang, Jianhao Yuan, Hao Zhao, Gaoang Wang, Bo Zhao

Imagen Video: High Definition Video Generation with Diffusion Models

We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of…

Computer Vision and Pattern Recognition · Computer Science 2022-10-06 Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans

VIDM: Video Implicit Diffusion Models

Diffusion models have emerged as a powerful generative method for synthesizing high-quality and diverse set of images. In this paper, we propose a video generation method based on diffusion models, where the effects of motion are modeled in…

Computer Vision and Pattern Recognition · Computer Science 2022-12-02 Kangfu Mei, Vishal M. Patel

From Image to Video: An Empirical Study of Diffusion Representations

Diffusion models have revolutionized generative modeling, enabling unprecedented realism in image and video synthesis. This success has sparked interest in leveraging their representations for visual understanding tasks. While recent works…

Computer Vision and Pattern Recognition · Computer Science 2025-03-21 Pedro Vélez, Luisa F. Polanía, Yi Yang, Chuhan Zhang, Rishabh Kabra, Anurag Arnab, Mehdi S. M. Sajjadi

Continuous Video Process: Modeling Videos as Continuous Multi-Dimensional Processes for Video Prediction

Diffusion models have made significant strides in image generation, mastering tasks such as unconditional image synthesis, text-image translation, and image-to-image conversions. However, their capability falls short in the realm of video…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Gaurav Shrivastava, Abhinav Shrivastava

Flexible Diffusion Modeling of Long Videos

We present a framework for video modeling based on denoising diffusion probabilistic models that produces long-duration video completions in a variety of realistic environments. We introduce a generative model that can at test-time sample…

Computer Vision and Pattern Recognition · Computer Science 2022-12-19 William Harvey, Saeid Naderiparizi, Vaden Masrani, Christian Weilbach, Frank Wood

Generating Long Videos of Dynamic Scenes

We present a video generation model that accurately reproduces object motion, changes in camera viewpoint, and new content that arises over time. Existing video generation methods often fail to produce new content as a function of time…

Computer Vision and Pattern Recognition · Computer Science 2022-06-10 Tim Brooks, Janne Hellsten, Miika Aittala, Ting-Chun Wang, Timo Aila, Jaakko Lehtinen, Ming-Yu Liu, Alexei A. Efros, Tero Karras

Grid Diffusion Models for Text-to-Video Generation

Recent advances in the diffusion models have significantly improved text-to-image generation. However, generating videos from text is a more challenging task than generating images from text, due to the much larger dataset and higher…

Computer Vision and Pattern Recognition · Computer Science 2024-12-31 Taegyeong Lee, Soyeong Kwon, Taehwan Kim

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Text-to-video generation aims to produce a video based on a given prompt. Recently, several commercial video models have been able to generate plausible videos with minimal noise, excellent details, and high aesthetic scores. However, these…

Computer Vision and Pattern Recognition · Computer Science 2024-01-18 Haoxin Chen, Yong Zhang, Xiaodong Cun, Menghan Xia, Xintao Wang, Chao Weng, Ying Shan

Conditional Video Generation for High-Efficiency Video Compression

Perceptual studies demonstrate that conditional diffusion models excel at reconstructing video content aligned with human visual perception. Building on this insight, we propose a video compression framework that leverages conditional…

Computer Vision and Pattern Recognition · Computer Science 2025-09-26 Fangqiu Yi, Jingyu Xu, Jiawei Shao, Chi Zhang, Xuelong Li

Dreamix: Video Diffusion Models are General Video Editors

Text-driven image and video diffusion models have recently achieved unprecedented generation realism. While diffusion models have been successfully applied for image editing, very few works have done so for video editing. We present the…

Computer Vision and Pattern Recognition · Computer Science 2023-02-03 Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen

VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation

In this paper, we present VideoGen, a text-to-video generation approach, which can generate a high-definition video with high frame fidelity and strong temporal consistency using reference-guided latent diffusion. We leverage an…

Computer Vision and Pattern Recognition · Computer Science 2023-09-08 Xin Li, Wenqing Chu, Ye Wu, Weihang Yuan, Fanglong Liu, Qi Zhang, Fu Li, Haocheng Feng, Errui Ding, Jingdong Wang

GD-VDM: Generated Depth for better Diffusion-based Video Generation

The field of generative models has recently witnessed significant progress, with diffusion models showing remarkable performance in image generation. In light of this success, there is a growing interest in exploring the application of…

Computer Vision and Pattern Recognition · Computer Science 2023-06-21 Ariel Lapid, Idan Achituve, Lior Bracha, Ethan Fetaya

Survey of Video Diffusion Models: Foundations, Implementations, and Applications

Recent advances in diffusion models have revolutionized video generation, offering superior temporal consistency and visual quality compared to traditional generative adversarial networks-based approaches. While this emerging field shows…

Computer Vision and Pattern Recognition · Computer Science 2026-02-11 Yimu Wang, Xuye Liu, Wei Pang, Li Ma, Shuai Yuan, Paul Debevec, Ning Yu

VideoMerge: Towards Training-free Long Video Generation

Long video generation remains a challenging and compelling topic in computer vision. Diffusion based models, among the various approaches to video generation, have achieved state of the art quality with their iterative denoising procedures.…

Computer Vision and Pattern Recognition · Computer Science 2025-03-14 Siyang Zhang, Harry Yang, Ser-Nam Lim

Structure and Content-Guided Video Synthesis with Diffusion Models

Text-guided generative diffusion models unlock powerful image creation and editing tools. While these have been extended to video generation, current approaches that edit the content of existing footage while retaining structure require…

Computer Vision and Pattern Recognition · Computer Science 2023-02-07 Patrick Esser, Johnathan Chiu, Parmida Atighehchian, Jonathan Granskog, Anastasis Germanidis

Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models

In this study, we present an efficient and effective approach for achieving temporally consistent synthetic-to-real video translation in videos of varying lengths. Our method leverages off-the-shelf conditional image diffusion models,…

Computer Vision and Pattern Recognition · Computer Science 2023-05-31 Ernie Chu, Shuo-Yen Lin, Jun-Cheng Chen

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding excessive compute demands by training a diffusion model in a compressed lower-dimensional latent space. Here, we apply the LDM paradigm to high-resolution…

Computer Vision and Pattern Recognition · Computer Science 2023-12-29 Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis