Related papers: Autoregressive Image Generation with Randomized Pa…

200 papers

Autoregressive models have emerged as a powerful approach for visual generation but suffer from slow inference speed due to their sequential token-by-token prediction process. In this paper, we propose a simple yet effective approach for…

Computer Vision and Pattern Recognition · Computer Science 2025-04-04 Yuqing Wang, Shuhuai Ren, Zhijie Lin, Yujin Han, Haoyuan Guo, Zhenheng Yang, Difan Zou, Jiashi Feng, Xihui Liu

We present Locality-aware Parallel Decoding (LPD) to accelerate autoregressive image generation. Traditional autoregressive image generation relies on next-patch prediction, a memory-bound process that leads to high latency. Existing works…

Computer Vision and Pattern Recognition · Computer Science 2026-03-12 Zhuoyang Zhang, Luke J. Huang, Chengyue Wu, Shang Yang, Kelly Peng, Yao Lu, Song Han

Inspired by the remarkable success of autoregressive models in language modeling, this paradigm has been widely adopted in visual generation. However, the sequential token-by-token decoding mechanism inherent in traditional autoregressive…

Computer Vision and Pattern Recognition · Computer Science 2026-01-01 Siyang Wang, Hanting Li, Wei Li, Jie Hu, Xinghao Chen, Feng Zhao

Autoregressive models, built based on the Next Token Prediction (NTP) paradigm, show great potential in developing a unified framework that integrates both language and vision tasks. Pioneering works introduce NTP to autoregressive visual…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Yatian Pang, Peng Jin, Shuo Yang, Bin Lin, Bin Zhu, Zhenyu Tang, Liuhan Chen, Francis E. H. Tay, Ser-Nam Lim, Harry Yang, Li Yuan

This paper presents Randomized AutoRegressive modeling (RAR) for visual generation, which sets a new state-of-the-art performance on the image generation task while maintaining full compatibility with language modeling frameworks. The…

Computer Vision and Pattern Recognition · Computer Science 2024-11-04 Qihang Yu, Ju He, Xueqing Deng, Xiaohui Shen, Liang-Chieh Chen

We introduce Autoregressive Retrieval Augmentation (AR-RAG), a novel paradigm that enhances image generation by autoregressively incorporating k-nearest neighbor retrievals at the patch level. Unlike prior methods that perform a single,…

Computer Vision and Pattern Recognition · Computer Science 2025-06-17 Jingyuan Qi, Zhiyang Xu, Qifan Wang, Lifu Huang

In this work, we first revisit the sampling issues in current autoregressive (AR) image generation models and identify that image tokens, unlike text tokens, exhibit lower information density and non-uniform spatial distribution.…

Computer Vision and Pattern Recognition · Computer Science 2025-10-21 Xiaoxiao Ma, Feng Zhao, Pengyang Ling, Haibo Qiu, Zhixiang Wei, Hu Yu, Jie Huang, Zhixiong Zeng, Lin Ma

Autoregressive models have recently shown great promise in visual generation by leveraging discrete token sequences akin to language modeling. However, existing approaches often suffer from inefficiency, either due to token-by-token…

Computer Vision and Pattern Recognition · Computer Science 2025-11-20 Ruiqing Yang, Kaixin Zhang, Zheng Zhang, Shan You, Tao Huang

Visual autoregressive models typically adhere to a raster-order "next-token prediction" paradigm, which overlooks the spatial and temporal locality inherent in visual content. Specifically, visual tokens exhibit significantly stronger…

Computer Vision and Pattern Recognition · Computer Science 2025-03-17 Yefei He, Yuanyu He, Shaoxuan He, Feng Chen, Hong Zhou, Kaipeng Zhang, Bohan Zhuang

Autoregressive Transformer models have demonstrated impressive performance in video generation, but their sequential token-by-token decoding process poses a major bottleneck, particularly for long videos represented by tens of thousands of…

Computer Vision and Pattern Recognition · Computer Science 2025-03-19 Yang Ye, Junliang Guo, Haoyu Wu, Tianyu He, Tim Pearce, Tabish Rashid, Katja Hofmann, Jiang Bian

We propose a novel AutoRegressive Generation-based paradigm for image Segmentation (ARGenSeg), achieving multimodal understanding and pixel-level perception within a unified framework. Prior works integrating image segmentation into…

Computer Vision and Pattern Recognition · Computer Science 2025-10-24 Xiaolong Wang, Lixiang Ru, Ziyuan Huang, Kaixiang Ji, Dandan Zheng, Jingdong Chen, Jun Zhou

Prevailing autoregressive (AR) models for text-to-image generation either rely on heavy, computationally-intensive diffusion models to process continuous image tokens, or employ vector quantization (VQ) to obtain discrete tokens with…

Autoregressive (AR) models, the theoretical performance benchmark for learned lossless image compression, are often dismissed as impractical due to prohibitive computational cost. This work re-thinks this paradigm, introducing a framework…

Computer Vision and Pattern Recognition · Computer Science 2025-11-17 Daxin Li, Yuanchao Bai, Kai Wang, Wenbo Zhao, Junjun Jiang, Xianming Liu

Autoregressive (AR) models for image generation typically adopt a two-stage paradigm of vector quantization and raster-scan "next-token prediction", inspired by its great success in language modeling. However, due to the huge modality gap,…

Computer Vision and Pattern Recognition · Computer Science 2026-03-19 Hu Yu, Hao Luo, Hangjie Yuan, Yu Rong, Jie Huang, Feng Zhao

We introduce RandAR, a decoder-only visual autoregressive (AR) model capable of generating images in arbitrary token orders. Unlike previous decoder-only AR models that rely on a predefined generation order, RandAR removes this inductive…

Computer Vision and Pattern Recognition · Computer Science 2025-07-09 Ziqi Pang, Tianyuan Zhang, Fujun Luan, Yunze Man, Hao Tan, Kai Zhang, William T. Freeman, Yu-Xiong Wang

In this paper, we propose ZipAR, a training-free, plug-and-play parallel decoding framework for accelerating auto-regressive (AR) visual generation. The motivation stems from the observation that images exhibit local structures, and…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Yefei He, Feng Chen, Yuanyu He, Shaoxuan He, Hong Zhou, Kaipeng Zhang, Bohan Zhuang

The raster-ordered image token sequence exhibits a significant Euclidean distance between index-adjacent tokens at line breaks, making it unsuitable for autoregressive generation. To address this issue, this paper proposes Direction-Aware…

Computer Vision and Pattern Recognition · Computer Science 2025-04-17 Yijia Xu, Jianzhong Ju, Jian Luan, Jinshi Cui

Autoregressive image and video generators are trained with teacher-forced histories but must sample from their own generated prefixes at inference time, making them vulnerable to exposure bias and prefix drift. Existing remedies either…

Computer Vision and Pattern Recognition · Computer Science 2026-05-29 Xinyao Liao, Qiyuan He, Yicong Li, Jiayin Zhu, Xiaoye Qu, Wei Wei, Angela Yao

A key challenge in autoregressive image generation is to efficiently sample independent locations in parallel, while still modeling mutual dependencies with serial conditioning. Some recent works have addressed this by conditioning between…

Computer Vision and Pattern Recognition · Computer Science 2026-02-26 David Eigen

Image Super-Resolution (ISR) has seen significant progress with the introduction of remarkable generative models. However, challenges such as the trade-off between fidelity and realism, as well as computational complexity, have also…

Computer Vision and Pattern Recognition · Computer Science 2025-02-03 Yunpeng Qu, Kun Yuan, Jinhua Hao, Kai Zhao, Qizhi Xie, Ming Sun, Chao Zhou