Related papers: Binary Latent Diffusion

Binary Noise for Binary Tasks: Masked Bernoulli Diffusion for Unsupervised Anomaly Detection

The high performance of denoising diffusion models for image generation has paved the way for their application in unsupervised medical anomaly detection. As diffusion-based methods require a lot of GPU memory and have long sampling times,…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-19 · Julia Wolleb, Florentin Bieder, Paul Friedrich, Peter Zhang, Alicia Durrer, Philippe C. Cattin

Encoding Binary Concepts in the Latent Space of Generative Models for Enhancing Data Representation

Binary concepts are empirically used by humans to generalize efficiently. They are based on the Bernoulli distribution, which is the building block of information. These concepts span both low-level and high-level features such as "large vs…

Machine Learning · Computer Science · 2023-03-23 · Zizhao Hu, Mohammad Rostami

Latent Beam Diffusion Models for Generating Visual Sequences

While diffusion models excel at generating high-quality images from text prompts, they struggle with visual consistency when generating image sequences. Existing methods generate each image independently, leading to disjointed narratives -…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-24 · Guilherme Fernandes, Vasco Ramos, Regev Cohen, Idan Szpektor, João Magalhães

Generative Latent Diffusion for Efficient Spatiotemporal Data Reduction

Generative models have demonstrated strong performance in conditional settings and can be viewed as a form of data compression, where the condition serves as a compact representation. However, their limited controllability and…

Machine Learning · Computer Science · 2025-07-04 · Xiao Li, Liangji Zhu, Anand Rangarajan, Sanjay Ranka

Latent Space Imaging

Digital imaging systems have traditionally relied on brute-force measurement and processing of pixels arranged on regular grids. In contrast, the human visual system performs significant data reduction from the large number of…

Image and Video Processing · Electrical Eng. & Systems · 2025-03-25 · Matheus Souza, Yidan Zheng, Kaizhang Kang, Yogeshwar Nath Mishra, Qiang Fu, Wolfgang Heidrich

Transparent Image Layer Diffusion using Latent Transparency

We present LayerDiffuse, an approach enabling large-scale pretrained latent diffusion models to generate transparent images. The method allows generation of single transparent images or of multiple transparent layers. The method learns a…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-25 · Lvmin Zhang, Maneesh Agrawala

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Latent diffusion models (LDMs) dominate high-quality image generation, yet integrating representation learning with generative modeling remains a challenge. We introduce a novel generative image modeling framework that seamlessly bridges…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-23 · Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis

Instella-T2I: Pushing the Limits of 1D Discrete Latent Space Image Generation

Image tokenization plays a critical role in reducing the computational demands of modeling high-resolution images, significantly improving the efficiency of image and multimodal understanding and generation. Recent advances in 1D latent…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-27 · Ze Wang, Hao Chen, Benran Hu, Jiang Liu, Ximeng Sun, Jialian Wu, Yusheng Su, Xiaodong Yu, Emad Barsoum, Zicheng Liu

High-Resolution Image Synthesis with Latent Diffusion Models

By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond. Additionally, their formulation allows for a…

Computer Vision and Pattern Recognition · Computer Science · 2022-04-14 · Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, Björn Ommer

Learned Disentangled Latent Representations for Scalable Image Coding for Humans and Machines

As an increasing amount of image and video content will be analyzed by machines, there is demand for a new codec paradigm that is capable of compressing visual input primarily for the purpose of computer vision inference, while secondarily…

Image and Video Processing · Electrical Eng. & Systems · 2023-01-12 · Ezgi Ozyilkan, Mateen Ulhaq, Hyomin Choi, Fabien Racape

Nested Diffusion Models Using Hierarchical Latent Priors

We introduce nested diffusion models, an efficient and powerful hierarchical generative framework that substantially enhances the generation quality of diffusion models, particularly for images of complex scenes. Our approach employs a…

Computer Vision and Pattern Recognition · Computer Science · 2024-12-10 · Xiao Zhang, Ruoxi Jiang, Rebecca Willett, Michael Maire

Arbitrary-Scale Image Generation and Upsampling using Latent Diffusion Model and Implicit Neural Decoder

Super-resolution (SR) and image generation are important tasks in computer vision and are widely adopted in real-world applications. Most existing methods, however, generate images only at fixed-scale magnification and suffer from…

Computer Vision and Pattern Recognition · Computer Science · 2024-03-18 · Jinseok Kim, Tae-Kyun Kim

Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts

Recent advances in image generation have made diffusion models powerful tools for creating high-quality images. However, their iterative denoising process makes understanding and interpreting their semantic latent spaces more challenging…

Computation and Language · Computer Science · 2024-11-06 · E. Zhixuan Zeng, Yuhao Chen, Alexander Wong

Single-step Diffusion for Image Compression at Ultra-Low Bitrates

Although there have been significant advancements in image compression techniques, such as standard and learned codecs, these methods still suffer from severe quality degradation at extremely low bits per pixel. While recent diffusion-based…

Image and Video Processing · Electrical Eng. & Systems · 2025-09-23 · Chanung Park, Joo Chan Lee, Jong Hwan Ko

Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation

The steep computational cost of diffusion models at inference hinders their use as fast physics emulators. In the context of image and video generation, this computational drawback has been addressed by generating in the latent space of an…

Machine Learning · Computer Science · 2025-11-04 · François Rozet, Ruben Ohana, Michael McCabe, Gilles Louppe, François Lanusse, Shirley Ho

Controlling Latent Diffusion Using Latent CLIP

Instead of performing text-conditioned denoising in the image domain, latent diffusion models (LDMs) operate in latent space of a variational autoencoder (VAE), enabling more efficient processing at reduced computational costs. However,…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-12 · Jason Becker, Chris Wendler, Peter Baylies, Robert West, Christian Wressnegger

Compact Latent Representation for Image Compression (CLRIC)

Current image compression models often require separate models for each quality level, making them resource-intensive in terms of both training and storage. To address these limitations, we propose an innovative approach that utilizes…

Image and Video Processing · Electrical Eng. & Systems · 2025-09-30 · Ayman A. Ameen, Thomas Richter, André Kaup

Binary Diffusion Probabilistic Model

We propose the Binary Diffusion Probabilistic Model (BDPM), a generative framework specifically designed for data representations in binary form. Conventional denoising diffusion probabilistic models (DDPMs) assume continuous inputs, use…

Computer Vision and Pattern Recognition · Computer Science · 2025-10-01 · Vitaliy Kinakh, Slava Voloshynovskiy

High-resolution efficient image generation from WiFi CSI using a pretrained latent diffusion model

We present LatentCSI, a novel method for generating images of the physical environment from WiFi CSI measurements that leverages a pretrained latent diffusion model (LDM). Unlike prior approaches that rely on complex and computationally…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-08 · Eshan Ramesh, Takayuki Nishio

Towards Extreme Image Compression with Latent Feature Guidance and Diffusion Prior

Image compression at extremely low bitrates (below 0.1 bits per pixel (bpp)) is a significant challenge due to substantial information loss. In this work, we propose a novel two-stage extreme image compression framework that exploits the…

Image and Video Processing · Electrical Eng. & Systems · 2024-09-05 · Zhiyuan Li, Yanhui Zhou, Hao Wei, Chenyang Ge, Jingwen Jiang