Related papers: Randomized Autoregressive Visual Generation

200 papers

Visual autoregressive (AR) generation offers a promising path toward unifying vision and language models, yet its performance remains suboptimal against diffusion models. Prior work often attributes this gap to tokenizer limitations and…

Computer Vision and Pattern Recognition · Computer Science 2025-10-07 Qiyuan He , Yicong Li , Haotian Ye , Jinghao Wang , Xinyao Liao , Pheng-Ann Heng , Stefano Ermon , James Zou , Angela Yao

We introduce a new paradigm for AutoRegressive (AR) image generation, termed Set AutoRegressive Modeling (SAR). SAR generalizes the conventional AR to the next-set setting, i.e., splitting the sequence into arbitrary sets containing…

Computer Vision and Pattern Recognition · Computer Science 2024-10-15 Wenze Liu , Le Zhuo , Yi Xin , Sheng Xia , Peng Gao , Xiangyu Yue

Autoregressive (AR) models for image generation typically adopt a two-stage paradigm of vector quantization and raster-scan "next-token prediction", inspired by its great success in language modeling. However, due to the huge modality gap,…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Hu Yu , Hao Luo , Hangjie Yuan , Yu Rong , Jie Huang , Feng Zhao

Inspired by the remarkable success of autoregressive models in language modeling, this paradigm has been widely adopted in visual generation. However, the sequential token-by-token decoding mechanism inherent in traditional autoregressive…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Siyang Wang , Hanting Li , Wei Li , Jie Hu , Xinghao Chen , Feng Zhao

We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard…

Computer Vision and Pattern Recognition · Computer Science 2024-06-11 Keyu Tian , Yi Jiang , Zehuan Yuan , Bingyue Peng , Liwei Wang

Autoregressive models have recently shown great promise in visual generation by leveraging discrete token sequences akin to language modeling. However, existing approaches often suffer from inefficiency, either due to token-by-token…

Computer Vision and Pattern Recognition · Computer Science 2025-11-20 Ruiqing Yang , Kaixin Zhang , Zheng Zhang , Shan You , Tao Huang

Autoregressive (AR) image generators offer a language-model-friendly approach to image generation by predicting discrete image tokens in a causal sequence. However, unlike diffusion models, AR models lack a mechanism to refine previous…

Computer Vision and Pattern Recognition · Computer Science 2026-01-29 Cheng Cheng , Lin Song , Di An , Yicheng Xiao , Xuchong Zhang , Hongbin Sun , Ying Shan

Recent advances in autoregressive (AR) generative models have produced increasingly powerful systems for media synthesis. Among them, next-scale prediction has emerged as a popular paradigm, where models generate images in a coarse-to-fine…

Computer Vision and Pattern Recognition · Computer Science 2025-12-09 Gengze Zhou , Chongjian Ge , Hao Tan , Feng Liu , Yicong Hong

Visual autoregressive models typically adhere to a raster-order "next-token prediction" paradigm, which overlooks the spatial and temporal locality inherent in visual content. Specifically, visual tokens exhibit significantly stronger…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 Yefei He , Yuanyu He , Shaoxuan He , Feng Chen , Hong Zhou , Kaipeng Zhang , Bohan Zhuang

Recent advances in video generation have been dominated by diffusion and flow-matching models, which produce high-quality results but remain computationally intensive and difficult to scale. In this work, we introduce VideoAR, the first…

Computer Vision and Pattern Recognition · Computer Science 2026-01-15 Longbin Ji , Xiaoxiong Liu , Junyuan Shang , Shuohuan Wang , Yu Sun , Hua Wu , Haifeng Wang

Visual Auto-Regressive modeling (VAR) has shown promise in bridging the speed and quality gap between autoregressive image models and diffusion models. VAR reformulates autoregressive modeling by decomposing an image into successive…

Computer Vision and Pattern Recognition · Computer Science 2025-06-06 Hermann Kumbong , Xian Liu , Tsung-Yi Lin , Ming-Yu Liu , Xihui Liu , Ziwei Liu , Daniel Y. Fu , Christopher Ré , David W. Romero

We introduce ARPG, a novel visual Autoregressive model that enables Randomized Parallel Generation, addressing the inherent limitations of conventional raster-order approaches, which hinder inference efficiency and zero-shot generalization…

Computer Vision and Pattern Recognition · Computer Science 2026-03-02 Haopeng Li , Jinyue Yang , Guoqi Li , Huan Wang

Autoregressive models have demonstrated remarkable success in sequential data generation, particularly in NLP, but their extension to continuous-domain image generation presents significant challenges. Recent work, the masked autoregressive…

Computer Vision and Pattern Recognition · Computer Science 2025-04-28 Tiankai Hang , Jianmin Bao , Fangyun Wei , Dong Chen

Autoregressive (AR) modeling has achieved remarkable success in natural language processing by enabling models to generate text with coherence and contextual understanding through next token prediction. Recently, in image generation, VAR…

Computer Vision and Pattern Recognition · Computer Science 2024-12-20 Sucheng Ren , Qihang Yu , Ju He , Xiaohui Shen , Alan Yuille , Liang-Chieh Chen

Image Super-Resolution (ISR) has seen significant progress with the introduction of remarkable generative models. However, challenges such as the trade-off issues between fidelity and realism, as well as computational complexity, have also…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 Yunpeng Qu , Kun Yuan , Jinhua Hao , Kai Zhao , Qizhi Xie , Ming Sun , Chao Zhou

Existing autoregressive (AR) image generative models use a token-by-token generation schema. That is, they predict a per-token probability distribution and sample the next token from that distribution. The main challenge is how to model the…

Computer Vision and Pattern Recognition · Computer Science 2025-03-05 Qinyu Zhao , Stephen Gould , Liang Zheng

Autoregressive models recently achieved comparable results versus state-of-the-art Generative Adversarial Networks (GANs) with the help of Vector Quantized Variational AutoEncoders (VQ-VAE). However, autoregressive models have several…

Computer Vision and Pattern Recognition · Computer Science 2020-07-21 Kenan E. Ak , Ning Xu , Zhe Lin , Yilin Wang

We introduce RandAR, a decoder-only visual autoregressive (AR) model capable of generating images in arbitrary token orders. Unlike previous decoder-only AR models that rely on a predefined generation order, RandAR removes this inductive…

Computer Vision and Pattern Recognition · Computer Science 2025-07-09 Ziqi Pang , Tianyuan Zhang , Fujun Luan , Yunze Man , Hao Tan , Kai Zhang , William T. Freeman , Yu-Xiong Wang

Conventional wisdom suggests that autoregressive models are used to process discrete data. When applied to continuous modalities such as visual data, Visual AutoRegressive modeling (VAR) typically resorts to quantization-based approaches to…

Computer Vision and Pattern Recognition · Computer Science 2025-05-13 Chenze Shao , Fandong Meng , Jie Zhou

AutoRegressive (AR) models have made notable progress in image generation, with Masked AutoRegressive (MAR) models gaining attention for their efficient parallel decoding. However, MAR models have traditionally underperformed when compared…

Computer Vision and Pattern Recognition · Computer Science 2025-07-18 Yi Xin , Le Zhuo , Qi Qin , Siqi Luo , Yuewen Cao , Bin Fu , Yangfan He , Hongsheng Li , Guangtao Zhai , Xiaohong Liu , Peng Gao