Related papers: Depth Adaptive Efficient Visual Autoregressive Mod…

StepVAR: Structure-Texture Guided Pruning for Visual Autoregressive Models

Visual AutoRegressive (VAR) models based on next-scale prediction enable efficient hierarchical generation, yet the inference cost grows quadratically at high resolutions. We observe that the computationally intensive later scales…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Keli Liu, Zhendong Wang, Wengang Zhou, Houqiang Li

ActVAR: Activating Mixtures of Weights and Tokens for Efficient Visual Autoregressive Generation

Visual Autoregressive (VAR) models enable efficient image generation via next-scale prediction but face escalating computational costs as sequence length grows. Existing static pruning methods degrade performance by permanently removing…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Kaixin Zhang, Ruiqing Yang, Yuan Zhang, Shan You, Tao Huang

FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning

Visual Autoregressive (VAR) modeling has gained popularity for its shift towards next-scale prediction. However, existing VAR paradigms process the entire token map at each scale step, leading to the complexity and runtime scaling…

Computer Vision and Pattern Recognition · Computer Science 2025-07-09 Hang Guo, Yawei Li, Taolin Zhang, Jiangshan Wang, Tao Dai, Shu-Tao Xia, Luca Benini

Visual Implicit Autoregressive Modeling

Visual Autoregressive Modeling (VAR) based on next-scale prediction achieves strong generation quality, but its explicit deep stacks fix the amount of computation per scale and inflate memory at high resolutions. We introduce Visual…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Pengfei Jiang, Jixiang Luo, Luxi Lin, Zhaohong Huang, Xuelong Li

Visual Autoregressive Modelling for Monocular Depth Estimation

We propose a monocular depth estimation method based on visual autoregressive (VAR) priors, offering an alternative to diffusion-based approaches. Our method adapts a large-scale text-to-image VAR model and introduces a scale-wise…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Amir El-Ghoussani, André Kaup, Nassir Navab, Gustavo Carneiro, Vasileios Belagiannis

Progressive Supernet Training for Efficient Visual Autoregressive Modeling

Visual Auto-Regressive (VAR) models significantly reduce inference steps through the "next-scale" prediction paradigm. However, progressive multi-scale generation incurs substantial memory overhead due to cumulative KV caching, limiting…

Computer Vision and Pattern Recognition · Computer Science 2025-11-21 Xiaoyue Chen, Yuling Shi, Kaiyuan Li, Huandong Wang, Yong Li, Xiaodong Gu, Xinlei Chen, Mingbao Lin

ToProVAR: Efficient Visual Autoregressive Modeling via Tri-Dimensional Entropy-Aware Semantic Analysis and Sparsity Optimization

Visual Autoregressive (VAR) models enhance generation quality but face a critical efficiency bottleneck in later stages. In this paper, we present a novel optimization framework for VAR models that fundamentally differs from prior approaches…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Jiayu Chen, Ruoyu Lin, Zihao Zheng, Jingxin Li, Maoliang Li, Guojie Luo, Xiang Chen

LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization

Visual Autoregressive (VAR) has emerged as a promising approach in image generation, offering competitive potential and performance comparable to diffusion-based models. However, current AR-based visual generation models require substantial…

Computer Vision and Pattern Recognition · Computer Science 2024-11-27 Rui Xie, Tianchen Zhao, Zhihang Yuan, Rui Wan, Wenxi Gao, Zhenhua Zhu, Xuefei Ning, Yu Wang

Diversity Has Always Been There in Your Visual Autoregressive Models

Visual Autoregressive (VAR) models have recently garnered significant attention for their innovative next-scale prediction paradigm, offering notable advantages in both inference efficiency and image quality compared to traditional…

Computer Vision and Pattern Recognition · Computer Science 2025-11-24 Tong Wang, Guanyu Yang, Nian Liu, Kai Wang, Yaxing Wang, Abdelrahman M Shaker, Salman Khan, Fahad Shahbaz Khan, Senmao Li

Fine-Tuning Visual Autoregressive Models for Subject-Driven Generation

Recent advances in text-to-image generative models have enabled numerous practical applications, including subject-driven generation, which fine-tunes pretrained models to capture subject semantics from only a few examples. While…

Computer Vision and Pattern Recognition · Computer Science 2025-07-31 Jiwoo Chung, Sangeek Hyun, Hyunjun Kim, Eunseo Koh, MinKyu Lee, Jae-Pil Heo

Rethinking Structure Preservation in Text-Guided Image Editing with Visual Autoregressive Models

Visual autoregressive (VAR) models have recently emerged as a promising family of generative models, enabling a wide range of downstream vision tasks such as text-guided image editing. By shifting the editing paradigm from noise…

Computer Vision and Pattern Recognition · Computer Science 2026-03-31 Tao Xia, Jiawei Liu, Yukun Zhang, Ting Liu, Wei Wang, Lei Zhang

DepthART: Monocular Depth Estimation as Autoregressive Refinement Task

Monocular depth estimation has seen significant advances through discriminative approaches, yet their performance remains constrained by the limitations of training datasets. While generative approaches have addressed this challenge by…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Bulat Gabdullin, Nina Konovalova, Nikolay Patakin, Dmitry Senushkin, Anton Konushin

Implementing Adaptations for Vision AutoRegressive Model

The Vision AutoRegressive model (VAR) was recently introduced as an alternative to Diffusion Models (DMs) in the image generation domain. In this work we focus on its adaptations, which aim to fine-tune pre-trained models to perform specific…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Kaif Shaikh, Franziska Boenisch, Adam Dziedzic

Dynamic Mixture-of-Experts for Visual Autoregressive Model

Visual Autoregressive Models (VAR) offer efficient and high-quality image generation but suffer from computational redundancy due to repeated Transformer calls at increasing resolutions. We introduce a dynamic Mixture-of-Experts router…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Jort Vincenti, Metod Jazbec, Guoxuan Xia

Adversarial Error Correction for Visual Autoregressive Generation

Visual Autoregressive (VAR) models have emerged as a powerful paradigm for image synthesis by performing hierarchical next-scale prediction. However, VAR models are inherently prone to cascading error propagation, where subtle coarse-scale…

Computer Vision and Pattern Recognition · Computer Science 2026-05-26 Ligong Bi, Tao Huang, Jianyuan Guo, Chang Xu

HMAR: Efficient Hierarchical Masked Auto-Regressive Image Generation

Visual Auto-Regressive modeling (VAR) has shown promise in bridging the speed and quality gap between autoregressive image models and diffusion models. VAR reformulates autoregressive modeling by decomposing an image into successive…

Computer Vision and Pattern Recognition · Computer Science 2025-06-06 Hermann Kumbong, Xian Liu, Tsung-Yi Lin, Ming-Yu Liu, Xihui Liu, Ziwei Liu, Daniel Y. Fu, Christopher Ré, David W. Romero

FasterVAR: Plug-and-Play Acceleration for Visual Autoregressive Models

Visual Autoregressive (VAR) modeling departs from the next-token prediction paradigm of traditional Autoregressive (AR) models through next-scale prediction, enabling high-quality image generation. However, the VAR paradigm suffers from…

Computer Vision and Pattern Recognition · Computer Science 2026-05-28 Senmao Li, Kai Wang, Salman Khan, Fahad Shahbaz Khan, Jian Yang, Yaxing Wang

Inference-Time Scaling for Visual AutoRegressive modeling by Searching Representative Samples

While inference-time scaling has significantly enhanced generative quality in large language and diffusion models, its application to vector-quantized (VQ) visual autoregressive modeling (VAR) remains unexplored. We introduce VAR-Scaling,…

Computer Vision and Pattern Recognition · Computer Science 2026-01-13 Weidong Tang, Xinyan Wan, Siyu Li, Xiumei Wang

StyleVAR: Controllable Image Style Transfer via Visual Autoregressive Modeling

We build on the Visual Autoregressive Modeling (VAR) framework and formulate style transfer as conditional discrete sequence modeling in a learned latent space. Images are decomposed into multi-scale representations and tokenized into…

Computer Vision and Pattern Recognition · Computer Science 2026-05-13 Liqi Jing, Dingming Zhang, Peinian Li, Lichen Zhu, Yang Xu, Hanyu Xing

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

We present Visual AutoRegressive modeling (VAR), a new generation paradigm that redefines the autoregressive learning on images as coarse-to-fine "next-scale prediction" or "next-resolution prediction", diverging from the standard…

Computer Vision and Pattern Recognition · Computer Science 2024-06-11 Keyu Tian, Yi Jiang, Zehuan Yuan, Bingyue Peng, Liwei Wang