Related papers: Improved Vector Quantized Diffusion Models

200 papers

We present the vector quantized diffusion (VQ-Diffusion) model for text-to-image generation. This method is based on a vector quantized variational autoencoder (VQ-VAE) whose latent space is modeled by a conditional variant of the recently…

Computer Vision and Pattern Recognition · Computer Science 2022-03-04 Shuyang Gu , Dong Chen , Jianmin Bao , Fang Wen , Bo Zhang , Dongdong Chen , Lu Yuan , Baining Guo

By embedding discrete representations into a continuous latent space, we can leverage continuous-space latent diffusion models to handle generative modeling of discrete data. However, despite their initial success, most latent diffusion…

Machine Learning · Computer Science 2025-04-02 Bac Nguyen , Chieh-Hsin Lai , Yuhta Takida , Naoki Murata , Toshimitsu Uesaka , Stefano Ermon , Yuki Mitsufuji

Text-to-image diffusion models have emerged as a powerful framework for high-quality image generation given textual prompts. Their success has driven the rapid development of production-grade diffusion models that consistently increase in…

Computer Vision and Pattern Recognition · Computer Science 2024-09-04 Vage Egiazarian , Denis Kuznedelev , Anton Voronov , Ruslan Svirschevski , Michael Goin , Daniil Pavlov , Dan Alistarh , Dmitry Baranchuk

The design of image and video quality assessment (QA) algorithms is extremely important to benchmark and calibrate user experience in modern visual systems. A major drawback of the state-of-the-art QA methods is their limited ability to…

Image and Video Processing · Electrical Eng. & Systems 2025-12-30 Shankhanil Mitra , Diptanu De , Shika Rao , Rajiv Soundararajan

Diffusion models (DMs) have demonstrated remarkable achievements in synthesizing images of high fidelity and diversity. However, the extensive computational requirements and slow generative speed of diffusion models have limited their widespread…

Computer Vision and Pattern Recognition · Computer Science 2025-01-03 Jiaojiao Ye , Zhen Wang , Linnan Jiang

Classifier-free guided diffusion models have recently been shown to be highly effective at high-resolution image generation, and they have been widely used in large-scale diffusion frameworks including DALLE-2, Stable Diffusion and Imagen.…

Computer Vision and Pattern Recognition · Computer Science 2023-04-14 Chenlin Meng , Robin Rombach , Ruiqi Gao , Diederik P. Kingma , Stefano Ermon , Jonathan Ho , Tim Salimans

We show that diffusion models can achieve image sample quality superior to the current state-of-the-art generative models. We achieve this on unconditional image synthesis by finding a better architecture through a series of ablations. For…

Machine Learning · Computer Science 2021-06-02 Prafulla Dhariwal , Alex Nichol

The primary axes of interest in image-generating diffusion models are image quality, the amount of variation in the results, and how well the results align with a given condition, e.g., a class label or a text prompt. The popular…

Computer Vision and Pattern Recognition · Computer Science 2024-12-20 Tero Karras , Miika Aittala , Tuomas Kynkäänniemi , Jaakko Lehtinen , Timo Aila , Samuli Laine

Vector Quantized-Variational AutoEncoders (VQ-VAE) are generative models based on discrete latent representations of the data, where inputs are mapped to a finite set of learned embeddings. To generate new samples, an autoregressive prior…

Machine Learning · Statistics 2022-08-04 Max Cohen , Guillaume Quispe , Sylvain Le Corff , Charles Ollion , Eric Moulines

Since 2023, Vector Quantization (VQ)-based discrete generation methods have rapidly dominated human motion generation, primarily surpassing diffusion-based continuous generation methods in standard performance metrics. However, VQ-based…

Computer Vision and Pattern Recognition · Computer Science 2025-07-10 Zichong Meng , Yiming Xie , Xiaogang Peng , Zeyu Han , Huaizu Jiang

The integration of Vector Quantised Variational AutoEncoder (VQ-VAE) with autoregressive models as generation part has yielded high-quality results on image generation. However, the autoregressive models will strictly follow the progressive…

Computer Vision and Pattern Recognition · Computer Science 2024-03-01 Minghui Hu , Yujie Wang , Tat-Jen Cham , Jianfei Yang , P. N. Suganthan

Vector quantization (VQ) based ANN indexes, such as Inverted File System (IVF) and Product Quantization (PQ), have been widely applied to embedding based document retrieval thanks to the competitive time and memory efficiency. Originally,…

Information Retrieval · Computer Science 2022-04-29 Shitao Xiao , Zheng Liu , Weihao Han , Jianjin Zhang , Defu Lian , Yeyun Gong , Qi Chen , Fan Yang , Hao Sun , Yingxia Shao , Denvy Deng , Qi Zhang , Xing Xie

Diffusion models have achieved great success in image synthesis through iterative noise estimation using deep neural networks. However, the slow inference, high memory consumption, and computation intensity of the noise estimation model…

Computer Vision and Pattern Recognition · Computer Science 2023-06-09 Xiuyu Li , Yijiang Liu , Long Lian , Huanrui Yang , Zhen Dong , Daniel Kang , Shanghang Zhang , Kurt Keutzer

Recently, Vector Quantized AutoRegressive (VQ-AR) models have shown remarkable results in text-to-image synthesis by equally predicting discrete image tokens from the top left to bottom right in the latent space. Although the simple…

Computer Vision and Pattern Recognition · Computer Science 2023-09-21 Zhengcong Fei , Mingyuan Fan , Li Zhu , Junshi Huang

Diffusion models have gained popularity for generating images from textual descriptions. Nonetheless, the substantial need for computational resources continues to present a noteworthy challenge, contributing to time-consuming processes.…

Computer Vision and Pattern Recognition · Computer Science 2023-11-30 Hanwen Chang , Haihao Shen , Yiyang Cai , Xinyu Ye , Zhenzhong Xu , Wenhua Cheng , Kaokao Lv , Weiwei Zhang , Yintong Lu , Heng Guo

Significant investments have been made towards the commodification of diffusion models for generation of diverse media. Their mass-market adoption is however still hobbled by the intense hardware resource requirements of diffusion model…

Machine Learning · Computer Science 2025-06-10 Adil Hasan , Thomas Peyrin

Diffusion models have recently dominated image synthesis tasks. However, the iterative denoising process is expensive in computations at inference time, making diffusion models less practical for low-latency and scalable real-world…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Yefei He , Luping Liu , Jing Liu , Weijia Wu , Hong Zhou , Bohan Zhuang

Current image-to-image translation methods formulate the task with conditional generation models, leading to learning only the recolorization or regional changes as being constrained by the rich structural information provided by the…

Computer Vision and Pattern Recognition · Computer Science 2022-07-28 Yu-Jie Chen , Shin-I Cheng , Wei-Chen Chiu , Hung-Yu Tseng , Hsin-Ying Lee

Vector Quantization (VQ) is a well-known technique in deep learning for extracting informative discrete latent representations. VQ-embedded models have shown impressive results in a range of applications including image and speech…

Machine Learning · Computer Science 2023-10-05 Tanmay Gautam , Reid Pryzant , Ziyi Yang , Chenguang Zhu , Somayeh Sojoudi

There has been significant progress in text-conditional image generation models. Recent advancements in this field depend not only on improvements in model structures, but also on vast quantities of text-image paired datasets. However,…

Computer Vision and Pattern Recognition · Computer Science 2024-03-25 Seungdae Han , Joohee Kim