Related papers: VIDM: Video Implicit Diffusion Models

Video Diffusion Models

Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion model for video generation that shows very promising initial…

Computer Vision and Pattern Recognition · Computer Science 2022-06-24 Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet

GD-VDM: Generated Depth for better Diffusion-based Video Generation

The field of generative models has recently witnessed significant progress, with diffusion models showing remarkable performance in image generation. In light of this success, there is a growing interest in exploring the application of…

Computer Vision and Pattern Recognition · Computer Science 2023-06-21 Ariel Lapid, Idan Achituve, Lior Bracha, Ethan Fetaya

Video Interpolation with Diffusion Models

We present VIDIM, a generative model for video interpolation, which creates short videos given a start and end frame. In order to achieve high fidelity and generate motions unseen in the input data, VIDIM uses cascaded diffusion models to…

Computer Vision and Pattern Recognition · Computer Science 2024-04-02 Siddhant Jain, Daniel Watson, Eric Tabellion, Aleksander Hołyński, Ben Poole, Janne Kontkanen

Video Probabilistic Diffusion Models in Projected Latent Space

Despite the remarkable progress in deep generative models, synthesizing high-resolution and temporally coherent videos still remains a challenge due to their high-dimensionality and complex temporal dynamics along with large spatial…

Computer Vision and Pattern Recognition · Computer Science 2023-03-31 Sihyun Yu, Kihyuk Sohn, Subin Kim, Jinwoo Shin

LaMD: Latent Motion Diffusion for Image-Conditional Video Generation

The video generation field has witnessed rapid improvements with the introduction of recent diffusion models. While these models have successfully enhanced appearance quality, they still face challenges in generating coherent and natural…

Computer Vision and Pattern Recognition · Computer Science 2025-04-21 Yaosi Hu, Zhenzhong Chen, Chong Luo

IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation

We introduce a novel approach for high-resolution talking head generation from a single image and audio input. Prior methods using explicit face models, like 3D morphable models (3DMM) and facial landmarks, often fall short in generating…

Computer Vision and Pattern Recognition · Computer Science 2024-12-11 Sejong Yang, Seoung Wug Oh, Yang Zhou, Seon Joo Kim

Conditional Video Generation for High-Efficiency Video Compression

Perceptual studies demonstrate that conditional diffusion models excel at reconstructing video content aligned with human visual perception. Building on this insight, we propose a video compression framework that leverages conditional…

Computer Vision and Pattern Recognition · Computer Science 2025-09-26 Fangqiu Yi, Jingyu Xu, Jiawei Shao, Chi Zhang, Xuelong Li

3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models

Diffusion models have shown great promise for image generation, beating GANs in terms of generation diversity, with comparable image quality. However, their application to 3D shapes has been limited to point or voxel representations that…

Computer Vision and Pattern Recognition · Computer Science 2022-12-16 Gimin Nam, Mariem Khlifi, Andrew Rodriguez, Alberto Tono, Linqi Zhou, Paul Guerrero

Rethinking Video Super-Resolution: Towards Diffusion-Based Methods without Motion Alignment

In this work, we rethink the approach to video super-resolution by introducing a method based on the Diffusion Posterior Sampling framework, combined with an unconditional video diffusion transformer operating in latent space. The video…

Computer Vision and Pattern Recognition · Computer Science 2025-11-05 Zhihao Zhan, Wang Pang, Xiang Zhu, Yechao Bai

From Image to Video: An Empirical Study of Diffusion Representations

Diffusion models have revolutionized generative modeling, enabling unprecedented realism in image and video synthesis. This success has sparked interest in leveraging their representations for visual understanding tasks. While recent works…

Computer Vision and Pattern Recognition · Computer Science 2025-03-21 Pedro Vélez, Luisa F. Polanía, Yi Yang, Chuhan Zhang, Rishabh Kabra, Anurag Arnab, Mehdi S. M. Sajjadi

Diffusion Models for Video Prediction and Infilling

Predicting and anticipating future outcomes or reasoning about missing information in a sequence are critical skills for agents to be able to make intelligent decisions. This requires strong, temporally coherent generative capabilities.…

Computer Vision and Pattern Recognition · Computer Science 2022-11-15 Tobias Höppe, Arash Mehrjou, Stefan Bauer, Didrik Nielsen, Andrea Dittadi

Dreamix: Video Diffusion Models are General Video Editors

Text-driven image and video diffusion models have recently achieved unprecedented generation realism. While diffusion models have been successfully applied for image editing, very few works have done so for video editing. We present the…

Computer Vision and Pattern Recognition · Computer Science 2023-02-03 Eyal Molad, Eliahu Horwitz, Dani Valevski, Alex Rav Acha, Yossi Matias, Yael Pritch, Yaniv Leviathan, Yedid Hoshen

Video Diffusion Models: A Survey

Diffusion generative models have recently become a powerful technique for creating and modifying high-quality, coherent video content. This survey provides a comprehensive overview of the critical components of diffusion models for video…

Computer Vision and Pattern Recognition · Computer Science 2024-11-19 Andrew Melnik, Michal Ljubljanac, Cong Lu, Qi Yan, Weiming Ren, Helge Ritter

Imagen Video: High Definition Video Generation with Diffusion Models

We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of…

Computer Vision and Pattern Recognition · Computer Science 2022-10-06 Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans

Extreme Video Compression with Pre-trained Diffusion Models

Diffusion models have achieved remarkable success in generating high quality image and video data. More recently, they have also been used for image compression with high perceptual quality. In this paper, we present a novel approach to…

Image and Video Processing · Electrical Eng. & Systems 2024-02-15 Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, Deniz Gündüz

Accelerating Video Diffusion Models via Distribution Matching

Generative models, particularly diffusion models, have made significant success in data synthesis across various modalities, including images, videos, and 3D assets. However, current diffusion models are computationally intensive, often…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Yuanzhi Zhu, Hanshu Yan, Huan Yang, Kai Zhang, Junnan Li

Latent Video Diffusion Models for High-Fidelity Long Video Generation

AI-generated content has attracted lots of attention recently, but photo-realistic video synthesis is still challenging. Although many attempts using GANs and autoregressive models have been made in this area, the visual quality and length…

Computer Vision and Pattern Recognition · Computer Science 2023-03-21 Yingqing He, Tianyu Yang, Yong Zhang, Ying Shan, Qifeng Chen

Fashion-VDM: Video Diffusion Model for Virtual Try-On

We present Fashion-VDM, a video diffusion model (VDM) for generating virtual try-on videos. Given an input garment image and person video, our method aims to generate a high-quality try-on video of the person wearing the given garment,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-05 Johanna Karras, Yingwei Li, Nan Liu, Luyang Zhu, Innfarn Yoo, Andreas Lugmayr, Chris Lee, Ira Kemelmacher-Shlizerman

Diffusion-aided Extreme Video Compression with Lightweight Semantics Guidance

Modern video codecs and learning-based approaches struggle for semantic reconstruction at extremely low bit-rates due to reliance on low-level spatiotemporal redundancies. Generative models, especially diffusion models, offer a new paradigm…

Image and Video Processing · Electrical Eng. & Systems 2026-02-06 Maojun Zhang, Haotian Wu, Richeng Jin, Deniz Gunduz, Krystian Mikolajczyk

Continuous Video Process: Modeling Videos as Continuous Multi-Dimensional Processes for Video Prediction

Diffusion models have made significant strides in image generation, mastering tasks such as unconditional image synthesis, text-image translation, and image-to-image conversions. However, their capability falls short in the realm of video…

Computer Vision and Pattern Recognition · Computer Science 2024-12-10 Gaurav Shrivastava, Abhinav Shrivastava