Related papers: AR-Diffusion: Asynchronous Video Generation with A…

VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction

Recent advances in video generation have been dominated by diffusion and flow-matching models, which produce high-quality results but remain computationally intensive and difficult to scale. In this work, we introduce VideoAR, the first…

Computer Vision and Pattern Recognition · Computer Science 2026-01-15 Longbin Ji, Xiaoxiong Liu, Junyuan Shang, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang

AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation

Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance. Their success has been recently expanded to text generation via generating all tokens within a sequence concurrently.…

Computation and Language · Computer Science 2023-12-14 Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen

VideoSSM: Autoregressive Long Video Generation with Hybrid State-Space Memory

Autoregressive (AR) diffusion enables streaming, interactive long-video generation by producing frames causally, yet maintaining coherence over minute-scale horizons remains challenging due to accumulated errors, motion drift, and content…

Computer Vision and Pattern Recognition · Computer Science 2025-12-05 Yifei Yu, Xiaoshan Wu, Xinting Hu, Tao Hu, Yangtian Sun, Xiaoyang Lyu, Bo Wang, Lin Ma, Yuewen Ma, Zhongrui Wang, Xiaojuan Qi

Real-Time Motion-Controllable Autoregressive Video Diffusion

Real-time motion-controllable video generation remains challenging due to the inherent latency of bidirectional diffusion models and the lack of effective autoregressive (AR) approaches. Existing AR video diffusion models are limited to…

Computer Vision and Pattern Recognition · Computer Science 2026-03-10 Kesen Zhao, Jiaxin Shi, Beier Zhu, Junbao Zhou, Xiaolong Shen, Yuan Zhou, Qianru Sun, Hanwang Zhang

Autoregression-free video prediction using diffusion model for mitigating error propagation

Existing long-term video prediction methods often rely on an autoregressive video prediction mechanism. However, this approach suffers from error propagation, particularly in distant future frames. To address this limitation, this paper…

Computer Vision and Pattern Recognition · Computer Science 2025-06-02 Woonho Ko, Jin Bok Park, Il Yong Chun

ART•V: Auto-Regressive Text-to-Video Generation with Diffusion Models

We present ART•V, an efficient framework for auto-regressive video generation with diffusion models. Unlike existing methods that generate entire videos in one-shot, ART•V generates a single frame at a…

Computer Vision and Pattern Recognition · Computer Science 2023-12-01 Wenming Weng, Ruoyu Feng, Yanhui Wang, Qi Dai, Chunyu Wang, Dacheng Yin, Zhiyuan Zhao, Kai Qiu, Jianmin Bao, Yuhui Yuan, Chong Luo, Yueyi Zhang, Zhiwei Xiong

DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes

Recent advances in diffusion models have improved controllable streetscape generation and supported downstream perception and planning tasks. However, challenges remain in accurately modeling driving scenes and generating long videos. To…

Computer Vision and Pattern Recognition · Computer Science 2025-05-30 Jianbiao Mei, Tao Hu, Xuemeng Yang, Licheng Wen, Yu Yang, Tiantian Wei, Yukai Ma, Min Dou, Botian Shi, Yong Liu

Streaming Autoregressive Video Generation via Diagonal Distillation

Large pretrained diffusion models have significantly enhanced the quality of generated videos, and yet their use in real-time streaming remains limited. Autoregressive models offer a natural framework for sequential frame synthesis but…

Computer Vision and Pattern Recognition · Computer Science 2026-03-12 Jinxiu Liu, Xuanming Liu, Kangfu Mei, Yandong Wen, Ming-Hsuan Yang, Weiyang Liu

FAR-Drive: Frame-AutoRegressive Video Generation in Closed-Loop Autonomous Driving

Despite rapid progress in autonomous driving, reliable training and evaluation of driving systems remain fundamentally constrained by the lack of scalable and interactive simulation environments. Recent generative video models achieve…

Computer Vision and Pattern Recognition · Computer Science 2026-03-17 Yaoru Li, Federico Landi, Marco Godi, Xin Jin, Ruiju Fu, Yufei Ma, Muyang Sun, Heyu Si, Qi Guo

Not All Frames Deserve Full Computation: Accelerating Autoregressive Video Generation via Selective Computation and Predictive Extrapolation

Autoregressive (AR) video diffusion models enable long-form video generation but remain expensive due to repeated multi-step denoising. Existing training-free acceleration methods rely on binary cache-or-recompute decisions, overlooking…

Computer Vision and Pattern Recognition · Computer Science 2026-04-06 Hanshuai Cui, Zhiqing Tang, Zhi Yao, Fanshuai Meng, Weijia Jia, Wei Zhao

End-to-End Training for Autoregressive Video Diffusion via Self-Resampling

Autoregressive video diffusion models hold promise for world simulation but are vulnerable to exposure bias arising from the train-test mismatch. While recent works address this via post-training, they typically rely on a bidirectional…

Computer Vision and Pattern Recognition · Computer Science 2025-12-18 Yuwei Guo, Ceyuan Yang, Hao He, Yang Zhao, Meng Wei, Zhenheng Yang, Weilin Huang, Dahua Lin

From Slow Bidirectional to Fast Autoregressive Video Diffusion Models

Current video diffusion models achieve impressive generation quality but struggle in interactive applications due to bidirectional attention dependencies. The generation of a single frame requires the model to process the entire sequence,…

Computer Vision and Pattern Recognition · Computer Science 2025-09-25 Tianwei Yin, Qiang Zhang, Richard Zhang, William T. Freeman, Fredo Durand, Eli Shechtman, Xun Huang

Recurrent Autoregressive Diffusion: Global Memory Meets Local Attention

Recent advancements in video generation have demonstrated the potential of using video diffusion models as world models, with autoregressive generation of infinitely long videos through masked conditioning. However, such models, usually…

Computer Vision and Pattern Recognition · Computer Science 2025-11-18 Taiye Chen, Zihan Ding, Anjian Li, Christina Zhang, Zeqi Xiao, Yisen Wang, Chi Jin

Video Diffusion Models

Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion model for video generation that shows very promising initial…

Computer Vision and Pattern Recognition · Computer Science 2022-06-24 Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet

Anchored Diffusion for Video Face Reenactment

Video generation has drawn significant interest recently, pushing the development of large-scale models capable of producing realistic videos with coherent motion. Due to memory constraints, these models typically generate short video…

Computer Vision and Pattern Recognition · Computer Science 2024-07-23 Idan Kligvasser, Regev Cohen, George Leifman, Ehud Rivlin, Michael Elad

Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models

Diffusion models have emerged as powerful generative frameworks by progressively adding noise to data through a forward process and then reversing this process to generate realistic samples. While these models have achieved strong…

Machine Learning · Computer Science 2025-03-04 Xingzhuo Guo, Yu Zhang, Baixu Chen, Haoran Xu, Jianmin Wang, Mingsheng Long

FilmWeaver: Weaving Consistent Multi-Shot Videos with Cache-Guided Autoregressive Diffusion

Current video generation models perform well at single-shot synthesis but struggle with multi-shot videos, facing critical challenges in maintaining character and background consistency across shots and flexibly generating videos of…

Computer Vision and Pattern Recognition · Computer Science 2025-12-15 Xiangyang Luo, Qingyu Li, Xiaokun Liu, Wenyu Qin, Miao Yang, Meng Wang, Pengfei Wan, Di Zhang, Kun Gai, Shao-Lun Huang

Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework

Auto-Regressive Video Diffusion Models (AR-VDMs) have shown strong capabilities in generating long, photorealistic videos, but suffer from two key limitations: (i) history forgetting, where the model loses track of previously generated…

Computer Vision and Pattern Recognition · Computer Science 2025-12-30 Jing Wang, Fengzhuo Zhang, Xiaoli Li, Vincent Y. F. Tan, Tianyu Pang, Chao Du, Aixin Sun, Zhuoran Yang

Progressive Autoregressive Video Diffusion Models

Current frontier video diffusion models have demonstrated remarkable results at generating high-quality videos. However, they can only generate short video clips, normally around 10 seconds or 240 frames, due to computation limitations…

Computer Vision and Pattern Recognition · Computer Science 2025-05-20 Desai Xie, Zhan Xu, Yicong Hong, Hao Tan, Difan Liu, Feng Liu, Arie Kaufman, Yang Zhou

CAR: Controllable Autoregressive Modeling for Visual Generation

Controllable generation, which enables fine-grained control over generated outputs, has emerged as a critical focus in visual generative models. Currently, there are two primary technical approaches in visual generation: diffusion models…

Computer Vision and Pattern Recognition · Computer Science 2024-10-08 Ziyu Yao, Jialin Li, Yifeng Zhou, Yong Liu, Xi Jiang, Chengjie Wang, Feng Zheng, Yuexian Zou, Lei Li