Related papers: Predict to Skip: Linear Multistep Feature Forecast…

Dynamic Diffusion Transformer

Diffusion Transformer (DiT), an emerging diffusion model for image generation, has demonstrated superior performance but suffers from substantial computational costs. Our investigations reveal that these costs stem from the static inference…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-10 · Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Yibing Song, Gao Huang, Fan Wang, Yang You

DyDiT++: Diffusion Transformers with Timestep and Spatial Dynamics for Efficient Visual Generation

Diffusion Transformer (DiT), an emerging diffusion model for visual generation, has demonstrated superior performance but suffers from substantial computational costs. Our investigations reveal that these costs primarily stem from the…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-15 · Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Hao Luo, Yibing Song, Gao Huang, Fan Wang, Yang You

From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers

Diffusion Transformers (DiT) have revolutionized high-fidelity image and video synthesis, yet their computational demands remain prohibitive for real-time applications. To solve this problem, feature caching has been proposed to accelerate…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-12 · Jiacheng Liu, Chang Zou, Yuanhuiyi Lyu, Junjie Chen, Linfeng Zhang

$\Delta$-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers

Diffusion models are widely recognized for generating high-quality and diverse images, but their poor real-time performance has led to numerous acceleration works, primarily focusing on UNet-based structures. With the more successful…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-04 · Pengtao Chen, Mingzhu Shen, Peng Ye, Jianjian Cao, Chongjun Tu, Christos-Savvas Bouganis, Yiren Zhao, Tao Chen

ProCache: Constraint-Aware Feature Caching with Selective Computation for Diffusion Transformer Acceleration

Diffusion Transformers (DiTs) have achieved state-of-the-art performance in generative modeling, yet their high computational cost hinders real-time deployment. While feature caching offers a promising training-free acceleration solution by…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-16 · Fanpu Cao, Yaofo Chen, Zeng You, Wei Luo

FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute

Despite their remarkable performance, modern Diffusion Transformers are hindered by substantial resource requirements during inference, stemming from the fixed and large amount of compute needed for each denoising step. In this work, we…

Machine Learning · Computer Science · 2025-02-28 · Sotiris Anagnostidis, Gregor Bachmann, Yeongmin Kim, Jonas Kohler, Markos Georgopoulos, Artsiom Sanakoyeu, Yuming Du, Albert Pumarola, Ali Thabet, Edgar Schönfeld

RAPID$^3$: Tri-Level Reinforced Acceleration Policies for Diffusion Transformer

Diffusion Transformers (DiTs) excel at visual generation yet remain hampered by slow sampling. Existing training-free accelerators (step reduction, feature caching, and sparse attention) enhance inference speed but typically rely on a…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-29 · Wangbo Zhao, Yizeng Han, Zhiwei Tang, Jiasheng Tang, Pengfei Zhou, Kai Wang, Bohan Zhuang, Zhangyang Wang, Fan Wang, Yang You

Diffusion Transformers (DiTs) have demonstrated remarkable generative capabilities, particularly benefiting from Transformer architectures that enhance visual and artistic fidelity. However, their inherently sequential denoising process…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-08 · Hanqi Chen, Xu Zhang, Xiaoliu Guan, Lielin Jiang, Guanzhong Wang, Zeyu Chen, Yi Liu

Forecasting When to Forecast: Accelerating Diffusion Models with Confidence-Gated Taylor

Diffusion Transformers (DiTs) have demonstrated remarkable performance in visual generation tasks. However, their low inference speed limits their deployment in low-resource applications. Recent training-free approaches exploit the…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-11 · Xiaoliu Guan, Lielin Jiang, Hanqi Chen, Xu Zhang, Jiaxing Yan, Guanzhong Wang, Yi Liu, Zetao Zhang, Yu Wu

FIS-DiT: Breaking the Few-Step Video Inference Barrier via Training-Free Frame Interleaved Sparsity

While the overall inference latency of Video Diffusion Transformers (DiTs) can be substantially reduced through model distillation, per-step inference latency remains a critical bottleneck. Existing acceleration paradigms primarily exploit…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-13 · Jian Tang, Jiawei Fan, Qingbin Liu, Zheng Wei

Beyond Fixed Formulas: Data-Driven Linear Predictor for Efficient Diffusion Models

To address the high sampling cost of Diffusion Transformers (DiTs), feature caching offers a training-free acceleration method. However, existing methods rely on hand-crafted forecasting formulas that fail under aggressive skipping. We…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-30 · Zhirong Shen, Rui Huang, Jiacheng Liu, Chang Zou, Peiliang Cai, Shikang Zheng, Zhengyi Shi, Liang Feng, Linfeng Zhang

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

Diffusion models have recently achieved great success in the synthesis of high-quality images and videos. However, the existing denoising techniques in diffusion models are commonly based on step-by-step noise predictions, which suffers…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-15 · Hancheng Ye, Jiakang Yuan, Renqiu Xia, Xiangchao Yan, Tao Chen, Junchi Yan, Botian Shi, Bo Zhang

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Recent advances in diffusion models have demonstrated remarkable capabilities in video generation. However, the computational intensity remains a significant challenge for practical applications. While feature caching has been proposed to…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-29 · Xuran Ma, Yexin Liu, Yaofu Liu, Xianfeng Wu, Mingzhe Zheng, Zihao Wang, Ser-Nam Lim, Harry Yang

Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints

Diffusion Transformers (DiT) have emerged as a powerful architecture for image and video generation, offering superior quality and scalability. However, their practical application suffers from inherent dynamic feature instability, leading…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-10 · Guanjie Chen, Xinyu Zhao, Yucheng Zhou, Xiaoye Qu, Tianlong Chen, Yu Cheng

FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification

Diffusion Transformers (DiT) have attracted significant attention in research. However, they suffer from a slow convergence rate. In this paper, we aim to accelerate DiT training without any architectural modification. We identify the…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-01 · Jingfeng Yao, Wang Cheng, Wenyu Liu, Xinggang Wang

A-SelecT: Automatic Timestep Selection for Diffusion Transformer Representation Learning

Diffusion models have significantly reshaped the field of generative artificial intelligence and are now increasingly explored for their capacity in discriminative representation learning. Diffusion Transformer (DiT) has recently gained…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-30 · Changyu Liu, James Chenhao Liang, Wenhao Yang, Yiming Cui, Jinghao Yang, Tianyang Wang, Qifan Wang, Dongfang Liu, Cheng Han

Elastic Diffusion Transformer

Diffusion Transformers (DiT) have demonstrated remarkable generative capabilities but remain highly computationally expensive. Previous acceleration methods, such as pruning and distillation, typically rely on a fixed computational…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-17 · Jiangshan Wang, Zeqiang Lai, Jiarui Chen, Jiayi Guo, Hang Guo, Xiu Li, Xiangyu Yue, Chunchao Guo

FREE: Uncertainty-Aware Autoregression for Parallel Diffusion Transformers

Diffusion Transformers (DiTs) achieve state-of-the-art generation quality but require long sequential denoising trajectories, leading to high inference latency. Recent speculative inference methods enable lossless parallel sampling in…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-26 · Xinwan Wen, Bowen Li, Jiajun Luo, Ye Li, Zhi Wang

LESA: Learnable Stage-Aware Predictors for Diffusion Model Acceleration

Diffusion models have achieved remarkable success in image and video generation tasks. However, the high computational demands of Diffusion Transformers (DiTs) pose a significant challenge to their practical deployment. While feature…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-28 · Peiliang Cai, Jiacheng Liu, Haowen Xu, Xinyu Wang, Chang Zou, Linfeng Zhang

Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration

Diffusion models have become the dominant tool for high-fidelity image and video generation, yet are critically bottlenecked by their inference speed due to the numerous iterative passes of Diffusion Transformers. To reduce the exhaustive…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-03 · Jiaqi Han, Juntong Shi, Puheng Li, Haotian Ye, Qiushan Guo, Stefano Ermon