Related papers: The Diffusion Encoder

Latent Diffusion for Language Generation

Diffusion models have achieved great success in modeling continuous data modalities such as images, audio, and video, but have seen limited use in discrete domains such as language. Recent attempts to adapt diffusion to language have…

Computation and Language · Computer Science 2023-11-08 Justin Lovelace, Varsha Kishore, Chao Wan, Eliot Shekhtman, Kilian Q. Weinberger

Empowering Diffusion Models on the Embedding Space for Text Generation

Diffusion models have achieved state-of-the-art synthesis quality on both visual and audio tasks, and recent works further adapt them to textual data by diffusing on the embedding space. In this paper, we conduct systematic studies of the…

Computation and Language · Computer Science 2024-04-23 Zhujin Gao, Junliang Guo, Xu Tan, Yongxin Zhu, Fang Zhang, Jiang Bian, Linli Xu

Identity Encoder for Personalized Diffusion

Many applications can benefit from personalized image generation models, including image enhancement and video conferencing, to name a few. Existing works achieved personalization by fine-tuning one model for each person. While being…

Computer Vision and Pattern Recognition · Computer Science 2023-04-18 Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

Transformer-based Learned Image Compression for Joint Decoding and Denoising

This work introduces a Transformer-based image compression system. It has the flexibility to switch between the standard image reconstruction and the denoising reconstruction from a single compressed bitstream. Instead of training separate…

Image and Video Processing · Electrical Eng. & Systems 2024-02-21 Yi-Hsin Chen, Kuan-Wei Ho, Shiau-Rung Tsai, Guan-Hsun Lin, Alessandro Gnutti, Wen-Hsiao Peng, Riccardo Leonardi

Correcting Diffusion-Based Perceptual Image Compression with Privileged End-to-End Decoder

The images produced by diffusion models can attain excellent perceptual quality. However, it is challenging for diffusion models to guarantee distortion, hence the integration of diffusion models and image compression models still needs…

Image and Video Processing · Electrical Eng. & Systems 2024-05-03 Yiyang Ma, Wenhan Yang, Jiaying Liu

Geometry-Preserving Encoder/Decoder in Latent Generative Models

Generative modeling aims to generate new data samples that resemble a given dataset, with diffusion models recently becoming the most popular generative model. One of the main challenges of diffusion models is solving the problem in the…

Numerical Analysis · Mathematics 2025-10-08 Wonjun Lee, Riley C. W. O'Neill, Dongmian Zou, Jeff Calder, Gilad Lerman

Diffusion Autoencoders: Toward a Meaningful and Decodable Representation

Diffusion probabilistic models (DPMs) have achieved remarkable quality in image generation that rivals GANs'. But unlike GANs, DPMs use a set of latent variables that lack semantic meaning and cannot serve as a useful representation for…

Computer Vision and Pattern Recognition · Computer Science 2022-03-14 Konpat Preechakul, Nattanat Chatthee, Suttisak Wizadwongsa, Supasorn Suwajanakorn

Diffusion-Based Representation Learning

Diffusion-based methods represented as stochastic differential equations on a continuous-time domain have recently proven successful as a non-adversarial generative model. Training such models relies on denoising score matching, which can…

Machine Learning · Computer Science 2024-11-05 Sarthak Mittal, Korbinian Abstreiter, Stefan Bauer, Bernhard Schölkopf, Arash Mehrjou

Encoder-Decoder Diffusion Language Models for Efficient Training and Inference

Discrete diffusion models enable parallel token sampling for faster inference than autoregressive approaches. However, prior diffusion models use a decoder-only architecture, which requires sampling algorithms that invoke the full network…

Machine Learning · Computer Science 2025-10-28 Marianne Arriola, Yair Schiff, Hao Phung, Aaron Gokaslan, Volodymyr Kuleshov

PosDiffAE: Position-aware Diffusion Auto-encoder For High-Resolution Brain Tissue Classification Incorporating Artifact Restoration

Denoising diffusion models produce high-fidelity image samples by capturing the image distribution in a progressive manner while initializing with a simple distribution and compounding the distribution complexity. Although these models have…

Computer Vision and Pattern Recognition · Computer Science 2025-07-04 Ayantika Das, Moitreya Chaudhuri, Koushik Bhat, Keerthi Ram, Mihail Bota, Mohanasankar Sivaprakasam

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference

One of the main drawbacks of diffusion models is the slow inference time for image generation. Among the most successful approaches to addressing this problem are distillation methods. However, these methods require considerable…

Computer Vision and Pattern Recognition · Computer Science 2024-10-16 Senmao Li, Taihang Hu, Joost van de Weijer, Fahad Shahbaz Khan, Tao Liu, Linxuan Li, Shiqi Yang, Yaxing Wang, Ming-Ming Cheng, Jian Yang

DiffEnc: Variational Diffusion with a Learned Encoder

Diffusion models may be viewed as hierarchical variational autoencoders (VAEs) with two improvements: parameter sharing for the conditional distributions in the generative process and efficient computation of the loss as independent terms…

Machine Learning · Computer Science 2025-10-20 Beatrix M. G. Nielsen, Anders Christensen, Andrea Dittadi, Ole Winther

Neural Network Diffusion

Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an…

Machine Learning · Computer Science 2025-01-03 Kai Wang, Dongwen Tang, Boya Zeng, Yida Yin, Zhaopan Xu, Yukun Zhou, Zelin Zang, Trevor Darrell, Zhuang Liu, Yang You

Variational Diffusion Channel Decoder

Neural channel decoders, as a data-driven channel decoding strategy, have shown very promising improvements in error-correcting capability over classical methods. However, the success of these deep learning-based decoders comes at the cost…

Information Theory · Computer Science 2026-05-20 Chengwei Zhang, Yifan Du, Siyu Liao

DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

Real-world data generation often involves complex inter-dependencies among instances, violating the IID-data hypothesis of standard learning paradigms and posing a challenge for uncovering the geometric structures for learning desired…

Machine Learning · Computer Science 2023-05-30 Qitian Wu, Chenxiao Yang, Wentao Zhao, Yixuan He, David Wipf, Junchi Yan

Latent-Compressed Variational Autoencoder for Video Diffusion Models

Video variational autoencoders (VAEs) used in latent diffusion models typically require a sufficiently large number of latent channels to ensure high-quality video reconstruction. However, recent studies have revealed that an excessive…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Jiarui Guan, Wenshuai Zhao, Zhengtao Zou, Juho Kannala, Arno Solin

How to Train Your Latent Diffusion Language Model Jointly With the Latent Space

Latent diffusion models offer an attractive alternative to discrete diffusion for non-autoregressive text generation by operating on continuous text representations and denoising entire sequences in parallel. The major challenge in latent…

Computation and Language · Computer Science 2026-05-11 Viacheslav Meshchaninov, Alexander Shabalin, Egor Chimbulatov, Nikita Gushchin, Ilya Koziev, Alexander Korotin, Dmitry Vetrov

On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning

Diffusion autoencoders (DAs) are variants of diffusion generative models that use an input-dependent latent variable to capture representations alongside the diffusion process. These representations, to varying extents, can be used for…

Machine Learning · Computer Science 2025-06-03 Magdalena Proszewska, Nikolay Malkin, N. Siddharth

Enhancing the Rate-Distortion-Perception Flexibility of Learned Image Codecs with Conditional Diffusion Decoders

Learned image compression codecs have recently achieved impressive compression performance, surpassing the most efficient image coding architectures. However, most approaches are trained to minimize rate and distortion, which often leads to…

Computer Vision and Pattern Recognition · Computer Science 2024-03-06 Daniele Mari, Simone Milani

Fast Training of Diffusion Models with Masked Transformers

We propose an efficient approach to train large diffusion models with masked transformers. While masked transformers have been extensively explored for representation learning, their application to generative learning is less explored in…

Computer Vision and Pattern Recognition · Computer Science 2024-03-06 Hongkai Zheng, Weili Nie, Arash Vahdat, Anima Anandkumar