Related papers: Autoregressive Image Generation using Residual Qua…

200 papers

Existing vector quantization (VQ) based autoregressive models follow a two-stage generation paradigm that first learns a codebook to encode images as discrete codes, and then completes generation based on the learned codebook. However, they…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Mengqi Huang, Zhendong Mao, Zhuowei Chen, Yongdong Zhang

Existing autoregressive models follow the two-stage generation paradigm that first learns a codebook in the latent space for image reconstruction and then completes the image generation autoregressively based on the learned codebook.…

Computer Vision and Pattern Recognition · Computer Science 2023-05-24 Mengqi Huang, Zhendong Mao, Quan Wang, Yongdong Zhang

Although autoregressive models have achieved promising results on image generation, their unidirectional generation process prevents the resultant images from fully reflecting global contexts. To address the issue, we propose an effective…

Computer Vision and Pattern Recognition · Computer Science 2022-06-10 Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han

We propose a multi-layer variational autoencoder method, we call HR-VQVAE, that learns hierarchical discrete representations of the data. By utilizing a novel objective function, each layer in HR-VQVAE learns a discrete representation of…

Computer Vision and Pattern Recognition · Computer Science 2022-08-10 Mohammad Adiban, Kalin Stefanov, Sabato Marco Siniscalchi, Giampiero Salvi

We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher…

Machine Learning · Computer Science 2019-06-04 Ali Razavi, Aaron van den Oord, Oriol Vinyals

Image tokenizers play a critical role in shaping the performance of subsequent generative models. Since the introduction of VQ-GAN, discrete image tokenization has undergone remarkable advancements. Improvements in architecture,…

Computer Vision and Pattern Recognition · Computer Science 2024-12-03 Xiang Li, Kai Qiu, Hao Chen, Jason Kuen, Jiuxiang Gu, Jindong Wang, Zhe Lin, Bhiksha Raj

We introduce ResGen, an efficient Residual Vector Quantization (RVQ)-based generative model for high-fidelity generation with fast sampling. RVQ improves data fidelity by increasing the number of quantization steps, referred to as depth,…

Machine Learning · Computer Science 2025-06-03 Jaehyeon Kim, Taehong Moon, Keon Lee, Jaewoong Cho

The integration of Vector Quantised Variational AutoEncoder (VQ-VAE) with autoregressive models as generation part has yielded high-quality results on image generation. However, the autoregressive models will strictly follow the progressive…

Computer Vision and Pattern Recognition · Computer Science 2024-03-01 Minghui Hu, Yujie Wang, Tat-Jen Cham, Jianfei Yang, P. N. Suganthan

The Residual Quantization (RQ) framework is revisited where the quantization distortion is being successively reduced in multi-layers. Inspired by the reverse-water-filling paradigm in rate-distortion theory, an efficient regularization on…

Machine Learning · Computer Science 2017-05-02 Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov

The emergence of visual autoregressive (AR) models has revolutionized image generation while presenting new challenges for synthetic image detection. Unlike previous GAN or diffusion-based methods, AR models generate images through discrete…

Computer Vision and Pattern Recognition · Computer Science 2025-10-08 Yanran Zhang, Bingyao Yu, Yu Zheng, Wenzhao Zheng, Yueqi Duan, Lei Chen, Jie Zhou, Jiwen Lu

We present the vector quantized diffusion (VQ-Diffusion) model for text-to-image generation. This method is based on a vector quantized variational autoencoder (VQ-VAE) whose latent space is modeled by a conditional variant of the recently…

Computer Vision and Pattern Recognition · Computer Science 2022-03-04 Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo

Vector quantization (VQ) transforms continuous image features into discrete representations, providing compressed, tokenized inputs for generative models. However, VQ-based frameworks suffer from several issues, such as non-smooth latent…

Computer Vision and Pattern Recognition · Computer Science 2025-11-11 Sicheng Yang, Xing Hu, Qiang Wu, Dawei Yang

Quantization methods have been introduced to perform large scale approximate nearest search tasks. Residual Vector Quantization (RVQ) is one of the effective quantization methods. RVQ uses a multi-stage codebook learning scheme to lower the…

Computer Vision and Pattern Recognition · Computer Science 2015-09-18 Shicong Liu, Hongtao Lu, Junru Shao

Vector-Quantized Variational Autoencoders (VQ-VAE)[1] provide an unsupervised model for learning discrete representations by combining vector quantization and autoencoders. In this paper, we study the use of VQ-VAE for representation…

Image and Video Processing · Electrical Eng. & Systems 2019-03-05 Hanwei Wu, Markus Flierl

Product Quantization (PQ) has long been a mainstream for generating an exponentially large codebook at very low memory/time cost. Despite its success, PQ is still tricky for the decomposition of high-dimensional vector space, and the…

Computer Vision and Pattern Recognition · Computer Science 2020-12-08 Lianli Gao, Xiaosu Zhu, Jingkuan Song, Zhou Zhao, Heng Tao Shen

Although two-stage Vector Quantized (VQ) generative models allow for synthesizing high-fidelity and high-resolution images, their quantization operator encodes similar patches within an image into the same index, resulting in a repeated…

Computer Vision and Pattern Recognition · Computer Science 2022-09-20 Chuanxia Zheng, Long Tung Vuong, Jianfei Cai, Dinh Phung

Autoregressive (AR) models have recently shown strong performance in image generation, where a critical component is the visual tokenizer (VT) that maps continuous pixel inputs to discrete token sequences. The quality of the VT largely…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Huawei Lin, Tong Geng, Zhaozhuo Xu, Weijie Zhao

Vector quantization (VQ) is a technique to deterministically learn features with discrete codebook representations. It is commonly performed with a variational autoencoding model, VQ-VAE, which can be further extended to hierarchical…

Recently, Vector Quantized AutoRegressive (VQ-AR) models have shown remarkable results in text-to-image synthesis by equally predicting discrete image tokens from the top left to bottom right in the latent space. Although the simple…

Computer Vision and Pattern Recognition · Computer Science 2023-09-21 Zhengcong Fei, Mingyuan Fan, Li Zhu, Junshi Huang

A learning-based framework for representation of domain-specific images is proposed where joint compression and denoising can be done using a VQ-based multi-layer network. While it learns to compress the images from a training set, the…

Computer Vision and Pattern Recognition · Computer Science 2017-07-10 Sohrab Ferdowsi, Slava Voloshynovskiy, Dimche Kostadinov