Related papers: Generative Latent Diffusion for Efficient Spatiote…

200 papers

Modern video codecs and learning-based approaches struggle for semantic reconstruction at extremely low bit-rates due to reliance on low-level spatiotemporal redundancies. Generative models, especially diffusion models, offer a new paradigm…

Image and Video Processing · Electrical Eng. & Systems 2026-02-06 Maojun Zhang, Haotian Wu, Richeng Jin, Deniz Gunduz, Krystian Mikolajczyk

This paper outlines an end-to-end optimized lossy image compression framework using diffusion generative models. The approach relies on the transform coding paradigm, where an image is mapped into a latent space for entropy coding and, from…

Image and Video Processing · Electrical Eng. & Systems 2024-01-03 Ruihan Yang, Stephan Mandt

Deep learning models have significantly improved the visual quality and accuracy on compressive sensing recovery. In this paper, we propose an algorithm for signal reconstruction from compressed measurements with image priors captured by a…

Machine Learning · Computer Science 2020-03-20 Shaojie Xu, Sihan Zeng, Justin Romberg

Perceptual studies demonstrate that conditional diffusion models excel at reconstructing video content aligned with human visual perception. Building on this insight, we propose a video compression framework that leverages conditional…

Computer Vision and Pattern Recognition · Computer Science 2025-09-26 Fangqiu Yi, Jingyu Xu, Jiawei Shao, Chi Zhang, Xuelong Li

Latent diffusion models (LDMs) dominate high-quality image generation, yet integrating representation learning with generative modeling remains a challenge. We introduce a novel generative image modeling framework that seamlessly bridges…

Computer Vision and Pattern Recognition · Computer Science 2026-01-23 Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

As generative technologies advance, visual content has evolved into a complex mix of natural and AI-generated images, driving the need for more efficient coding techniques that prioritize perceptual quality. Traditional codecs and learned…

Computer Vision and Pattern Recognition · Computer Science 2025-09-18 Jianhui Chang

In this study we develop dimension-reduction techniques to accelerate diffusion model inference in the context of synthetic data generation. The idea is to integrate compressed sensing into diffusion models (hence, CSDM): First, compress…

Machine Learning · Statistics 2025-09-30 Zhengyi Guo, Jiatu Li, Wenpin Tang, David D. Yao

Video variational autoencoders (VAEs) used in latent diffusion models typically require a sufficiently large number of latent channels to ensure high-quality video reconstruction. However, recent studies have revealed that an excessive…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Jiarui Guan, Wenshuai Zhao, Zhengtao Zou, Juho Kannala, Arno Solin

Diffusion models have achieved remarkable success in generating high quality image and video data. More recently, they have also been used for image compression with high perceptual quality. In this paper, we present a novel approach to…

Image and Video Processing · Electrical Eng. & Systems 2024-02-15 Bohan Li, Yiming Liu, Xueyan Niu, Bo Bai, Lei Deng, Deniz Gündüz

Popularized by their strong image generation performance, diffusion and related methods for generative modeling have found widespread success in visual media applications. In particular, diffusion methods have enabled new approaches to data…

Image and Video Processing · Electrical Eng. & Systems 2026-01-28 Yibo Yang, Stephan Mandt

Latent variable generative models have emerged as powerful tools for generative tasks including image and video synthesis. These models are enabled by pretrained autoencoders that map high resolution data into a compressed lower dimensional…

Computer Vision and Pattern Recognition · Computer Science 2025-06-13 Mohammed Suhail, Carlos Esteves, Leonid Sigal, Ameesh Makadia

A generative modeling framework is proposed that combines diffusion models and manifold learning to efficiently sample data densities on manifolds. The approach utilizes Diffusion Maps to uncover possible low-dimensional underlying (latent)…

Machine Learning · Computer Science 2025-04-22 Dimitris G. Giovanis, Ellis Crabtree, Roger G. Ghanem, Ioannis G. Kevrekidis

While generative models have seen significant adoption across a wide range of data modalities, including 3D data, a consensus on which model is best suited for which task has yet to be reached. Further, conditional information such as text…

Computer Vision and Pattern Recognition · Computer Science 2026-01-21 Matthias Humt, Ulrich Hillenbrand, Rudolph Triebel

Recent advances in generative modeling, namely Diffusion models, have revolutionized generative modeling, enabling high-quality image generation tailored to user needs. This paper proposes a framework for the generative design of structural…

Multi-modal data-sets are ubiquitous in modern applications, and multi-modal Variational Autoencoders are a popular family of models that aim to learn a joint representation of the different modalities. However, existing approaches suffer…

Machine Learning · Computer Science 2023-12-19 Mustapha Bounoua, Giulio Franzese, Pietro Michiardi

Diffusion probabilistic models have achieved enormous success in the field of image generation and manipulation. In this paper, we explore a novel paradigm of using the diffusion model and classifier guidance in the latent semantic space…

Computer Vision and Pattern Recognition · Computer Science 2023-05-25 Changhao Shi, Haomiao Ni, Kai Li, Shaobo Han, Mingfu Liang, Martin Renqiang Min

Although deep learning has achieved appealing results on several machine learning tasks, most of the models are deterministic at inference, limiting their application to single-modal settings. We propose a novel general-purpose framework…

Machine Learning · Computer Science 2020-10-12 Sameera Ramasinghe, Kanchana Ranasinghe, Salman Khan, Nick Barnes, Stephen Gould

We present LTM3D, a Latent Token space Modeling framework for conditional 3D shape generation that integrates the strengths of diffusion and auto-regressive (AR) models. While diffusion-based methods effectively model continuous latent…

Computer Vision and Pattern Recognition · Computer Science 2025-06-02 Xin Kang, Zihan Zheng, Lei Chu, Yue Gao, Jiahao Li, Hao Pan, Xuejin Chen, Yan Lu

In this age of information, images are a critical medium for storing and transmitting information. With the rapid growth of image data amount, visual compression and visual data perception are two important research topics attracting a lot…

Image and Video Processing · Electrical Eng. & Systems 2024-07-02 Yuefeng Zhang, Chuanmin Jia, Jianhui Chang, Siwei Ma

In this work, we present GPDiT, a Generative Pre-trained Autoregressive Diffusion Transformer that unifies the strengths of diffusion and autoregressive modeling for long-range video synthesis, within a continuous latent space. Instead of…

Computer Vision and Pattern Recognition · Computer Science 2025-10-09 Yuan Zhang, Jiacheng Jiang, Guoqing Ma, Zhiying Lu, Haoyang Huang, Jianlong Yuan, Nan Duan, Daxin Jiang