Related papers: Hierarchical Patch Diffusion Models for High-Resol…

High-Resolution Image Synthesis with Latent Diffusion Models

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a…

Computer Vision and Pattern Recognition · Computer Science · 2022-04-14 · Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

Matryoshka Diffusion Models

Diffusion models are the de facto approach for generating high-quality images and videos, but learning high-dimensional models remains a formidable task due to computational and optimization challenges. Existing methods often resort to…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-04 · Jiatao Gu, Shuangfei Zhai, Yizhe Zhang, Josh Susskind, Navdeep Jaitly

Video Probabilistic Diffusion Models in Projected Latent Space

Despite the remarkable progress in deep generative models, synthesizing high-resolution and temporally coherent videos still remains a challenge due to their high-dimensionality and complex temporal dynamics along with large spatial…

Computer Vision and Pattern Recognition · Computer Science · 2023-03-31 · Sihyun Yu, Kihyuk Sohn, Subin Kim, Jinwoo Shin

Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models

Latent Diffusion Models (LDMs) enable high-quality image synthesis while avoiding excessive compute demands by training a diffusion model in a compressed lower-dimensional latent space. Here, we apply the LDM paradigm to high-resolution…

Computer Vision and Pattern Recognition · Computer Science · 2023-12-29 · Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, Karsten Kreis

Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models

Diffusion models are powerful, but they require a lot of time and data to train. We propose Patch Diffusion, a generic patch-wise training framework, to significantly reduce the training time costs while improving data efficiency, which…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-20 · Zhendong Wang, Yifan Jiang, Huangjie Zheng, Peihao Wang, Pengcheng He, Zhangyang Wang, Weizhu Chen, Mingyuan Zhou

Efficient Video Diffusion Models: Advancements and Challenges

Video diffusion models have rapidly become the dominant paradigm for high-fidelity generative video synthesis, but their practical deployment remains constrained by severe inference costs. Compared with image generation, video synthesis…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-20 · Shitong Shao, Lichen Bai, Pengfei Wan, James Kwok, Zeke Xie

GD-VDM: Generated Depth for better Diffusion-based Video Generation

The field of generative models has recently witnessed significant progress, with diffusion models showing remarkable performance in image generation. In light of this success, there is a growing interest in exploring the application of…

Computer Vision and Pattern Recognition · Computer Science · 2023-06-21 · Ariel Lapid, Idan Achituve, Lior Bracha, Ethan Fetaya

Improving Progressive Generation with Decomposable Flow Matching

Generating high-dimensional visual modalities is a computationally intensive task. A common solution is progressive generation, where the outputs are synthesized in a coarse-to-fine spectral autoregressive manner. While diffusion models…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-25 · Moayed Haji-Ali, Willi Menapace, Ivan Skorokhodov, Arpit Sahni, Sergey Tulyakov, Vicente Ordonez, Aliaksandr Siarohin

Patched Denoising Diffusion Models For High-Resolution Image Synthesis

We propose an effective denoising diffusion model for generating high-resolution images (e.g., 1024$\times$512), trained on small-size image patches (e.g., 64$\times$64). We name our algorithm Patch-DM, in which a new feature collage…

Computer Vision and Pattern Recognition · Computer Science · 2023-08-03 · Zheng Ding, Mengqi Zhang, Jiajun Wu, Zhuowen Tu

VIDM: Video Implicit Diffusion Models

Diffusion models have emerged as a powerful generative method for synthesizing a high-quality and diverse set of images. In this paper, we propose a video generation method based on diffusion models, where the effects of motion are modeled in…

Computer Vision and Pattern Recognition · Computer Science · 2022-12-02 · Kangfu Mei, Vishal M. Patel

DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Diffusion models have achieved great success in synthesizing high-quality images. However, generating high-resolution images with diffusion models is still challenging due to the enormous computational costs, resulting in a prohibitive…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-16 · Muyang Li, Tianle Cai, Jiaxin Cao, Qinsheng Zhang, Han Cai, Junjie Bai, Yangqing Jia, Ming-Yu Liu, Kai Li, Song Han

Ultra-High-Resolution Image Synthesis with Pyramid Diffusion Model

We introduce the Pyramid Diffusion Model (PDM), a novel architecture designed for ultra-high-resolution image synthesis. PDM utilizes a pyramid latent representation, providing a broader design space that enables more flexible, structured,…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-20 · Jiajie Yang

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

Diffusion models have proven to be highly effective in image and video generation; however, they encounter challenges in the correct composition of objects when generating images of varying sizes due to single-scale training data. Adapting…

Computer Vision and Pattern Recognition · Computer Science · 2024-09-23 · Lanqing Guo, Yingqing He, Haoxin Chen, Menghan Xia, Xiaodong Cun, Yufei Wang, Siyu Huang, Yong Zhang, Xintao Wang, Qifeng Chen, Ying Shan, Bihan Wen

VideoFusion: Decomposed Diffusion Models for High-Quality Video Generation

A diffusion probabilistic model (DPM), which constructs a forward diffusion process by gradually adding noise to data points and learns the reverse denoising process to generate new samples, has been shown to handle complex data…

Computer Vision and Pattern Recognition · Computer Science · 2023-10-16 · Zhengxiong Luo, Dayou Chen, Yingya Zhang, Yan Huang, Liang Wang, Yujun Shen, Deli Zhao, Jingren Zhou, Tieniu Tan

Video Diffusion Models

Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion model for video generation that shows very promising initial…

Computer Vision and Pattern Recognition · Computer Science · 2022-06-24 · Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet

PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution

Pre-trained video generation models hold great potential for generative video super-resolution (VSR). However, adapting them for full-size VSR, as most existing methods do, suffers from unnecessary intensive full-attention computation and…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-01 · Shian Du, Menghan Xia, Chang Liu, Xintao Wang, Jing Wang, Pengfei Wan, Di Zhang, Xiangyang Ji

Fixed Point Diffusion Models

We introduce the Fixed Point Diffusion Model (FPDM), a novel approach to image generation that integrates the concept of fixed point solving into the framework of diffusion-based generative modeling. Our approach embeds an implicit fixed…

Computer Vision and Pattern Recognition · Computer Science · 2024-01-18 · Xingjian Bai, Luke Melas-Kyriazi

Pyramid Diffusion for Fine 3D Large Scene Generation

Diffusion models have shown remarkable results in generating 2D images and small-scale 3D objects. However, their application to the synthesis of large-scale 3D scenes has been rarely explored. This is mainly due to the inherent complexity…

Computer Vision and Pattern Recognition · Computer Science · 2024-07-19 · Yuheng Liu, Xinke Li, Xueting Li, Lu Qi, Chongshou Li, Ming-Hsuan Yang

Efficient Diffusion Models for Vision: A Survey

Diffusion Models (DMs) have demonstrated state-of-the-art performance in content generation without requiring adversarial training. These models are trained using a two-step process. First, a forward (diffusion) process gradually adds…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-13 · Anwaar Ulhaq, Naveed Akhtar

Hybrid Video Diffusion Models with 2D Triplane and 3D Wavelet Representation

Generating high-quality videos that synthesize desired realistic content is a challenging task due to their intricate high-dimensionality and complexity of videos. Several recent diffusion-based methods have shown comparable performance by…

Computer Vision and Pattern Recognition · Computer Science · 2024-04-05 · Kihong Kim, Haneol Lee, Jihye Park, Seyeon Kim, Kwanghee Lee, Seungryong Kim, Jaejun Yoo