Related papers: Visual Implicit Autoregressive Modeling

200 papers

We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-11 · Keyu Tian, Yi Jiang, Zehuan Yuan, Bingyue Peng, Liwei Wang

Visual AutoRegressive modeling (VAR) based on next-scale prediction has revitalized autoregressive visual generation. Although its full-context dependency, i.e., modeling all previous scales for next-scale prediction, facilitates more…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-04 · Yu Zhang, Jingyi Liu, Yiwei Shi, Qi Zhang, Duoqian Miao, Changwei Wang, Longbing Cao

Visual Autoregressive (VAR) models have emerged as a powerful paradigm for image synthesis by performing hierarchical next-scale prediction. However, VAR models are inherently prone to cascading error propagation, where subtle coarse-scale…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-26 · Ligong Bi, Tao Huang, Jianyuan Guo, Chang Xu

Recent advances in video generation have been dominated by diffusion and flow-matching models, which produce high-quality results but remain computationally intensive and difficult to scale. In this work, we introduce VideoAR, the first…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-15 · Longbin Ji, Xiaoxiong Liu, Junyuan Shang, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang

Visual Auto-Regressive (VAR) models significantly reduce inference steps through the "next-scale" prediction paradigm. However, progressive multi-scale generation incurs substantial memory overhead due to cumulative KV caching, limiting…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-21 · Xiaoyue Chen, Yuling Shi, Kaiyuan Li, Huandong Wang, Yong Li, Xiaodong Gu, Xinlei Chen, Mingbao Lin

Visual Autoregressive (VAR) modeling inefficiently applies a fixed computational depth to each position when generating high-resolution images. While existing methods accelerate inference by pruning tokens using frequency maps, their binary…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-21 · Chunliang Li, Tianze Cao, Sanyuan Zhao

Visual Autoregressive (VAR) has emerged as a promising approach in image generation, offering competitive potential and performance comparable to diffusion-based models. However, current AR-based visual generation models require substantial…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-27 · Rui Xie, Tianchen Zhao, Zhihang Yuan, Rui Wan, Wenxi Gao, Zhenhua Zhu, Xuefei Ning, Yu Wang

Visual Autoregressive (VAR) modeling has garnered significant attention for its innovative next-scale prediction approach, which yields substantial improvements in efficiency, scalability, and zero-shot generalization. Nevertheless, the…

Machine Learning · Computer Science · 2025-05-27 · Kunjun Li, Zigeng Chen, Cheng-Yen Yang, Jenq-Neng Hwang

Visual autoregressive models achieve remarkable generation quality through next-scale predictions across multi-scale token pyramids. However, the conventional method uses uniform scale downsampling to build these pyramids, leading to…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-25 · Xiaofan Li, Chenming Wu, Yanpeng Sun, Jiaming Zhou, Delin Qu, Yansong Qu, Weihao Bo, Haibao Yu, Dingkang Liang

Conventional wisdom suggests that autoregressive models are used to process discrete data. When applied to continuous modalities such as visual data, Visual AutoRegressive modeling (VAR) typically resorts to quantization-based approaches to…

Computer Vision and Pattern Recognition · Computer Science · 2025-05-13 · Chenze Shao, Fandong Meng, Jie Zhou

Autoregressive (AR) visual generation has emerged as a powerful paradigm for image and multimodal synthesis, owing to its scalability and generality. However, existing AR image generation suffers from severe memory bottlenecks due to the…

Computer Vision and Pattern Recognition · Computer Science · 2025-12-05 · Ziran Qin, Youru Lv, Mingbao Lin, Zeren Zhang, Chanfan Gan, Tieyuan Chen, Weiyao Lin

Image Super-Resolution (ISR) has seen significant progress with the introduction of remarkable generative models. However, challenges such as the trade-off issues between fidelity and realism, as well as computational complexity, have also…

Computer Vision and Pattern Recognition · Computer Science · 2025-02-03 · Yunpeng Qu, Kun Yuan, Jinhua Hao, Kai Zhao, Qizhi Xie, Ming Sun, Chao Zhou

Visual Auto-Regressive modeling (VAR) has shown promise in bridging the speed and quality gap between autoregressive image models and diffusion models. VAR reformulates autoregressive modeling by decomposing an image into successive…

Computer Vision and Pattern Recognition · Computer Science · 2025-06-06 · Hermann Kumbong, Xian Liu, Tsung-Yi Lin, Ming-Yu Liu, Xihui Liu, Ziwei Liu, Daniel Y. Fu, Christopher Ré, David W. Romero

Essential to visual generation is efficient modeling of visual data priors. Conventional next-token prediction methods define the process as learning the conditional probability distribution of successive tokens. Recently, next-scale…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-03 · Jinhua Zhang, Wei Long, Minghao Han, Weiyi You, Shuhang Gu

This paper presents Randomized AutoRegressive modeling (RAR) for visual generation, which sets a new state-of-the-art performance on the image generation task while maintaining full compatibility with language modeling frameworks. The…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-04 · Qihang Yu, Ju He, Xueqing Deng, Xiaohui Shen, Liang-Chieh Chen

There exists recent work in computer vision, named VAR, that proposes a new autoregressive paradigm for image generation. Diverging from the vanilla next-token prediction, VAR structurally reformulates the image generation into a coarse to…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-18 · Sucheng Ren, Yaodong Yu, Nataniel Ruiz, Feng Wang, Alan Yuille, Cihang Xie

Visual autoregressive models typically adhere to a raster-order "next-token prediction" paradigm, which overlooks the spatial and temporal locality inherent in visual content. Specifically, visual tokens exhibit significantly stronger…

Computer Vision and Pattern Recognition · Computer Science · 2025-03-17 · Yefei He, Yuanyu He, Shaoxuan He, Feng Chen, Hong Zhou, Kaipeng Zhang, Bohan Zhuang

Visual Autoregressive (VAR) modeling has gained popularity for its shift towards next-scale prediction. However, existing VAR paradigms process the entire token map at each scale step, leading to the complexity and runtime scaling…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-09 · Hang Guo, Yawei Li, Taolin Zhang, Jiangshan Wang, Tao Dai, Shu-Tao Xia, Luca Benini

Recent advances in text-to-image generative models have enabled numerous practical applications, including subject-driven generation, which fine-tunes pretrained models to capture subject semantics from only a few examples. While…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-31 · Jiwoo Chung, Sangeek Hyun, Hyunjun Kim, Eunseo Koh, MinKyu Lee, Jae-Pil Heo

Recently, Visual Autoregressive ($\mathsf{VAR}$) Models introduced a groundbreaking advancement in the field of image generation, offering a scalable approach through a coarse-to-fine "next-scale prediction" paradigm. Suppose that $n$…

Machine Learning · Computer Science · 2025-02-04 · Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhizhou Sha, Zhenmei Shi, Zhao Song