Related papers: Learning-to-Cache: Accelerating Diffusion Transfor…

DiffSparse: Accelerating Diffusion Transformers with Learned Token Sparsity

Diffusion models demonstrate outstanding performance in image generation, but their multi-step inference mechanism requires immense computational cost. Previous works accelerate inference by leveraging layer or token cache techniques to…

Computer Vision and Pattern Recognition · Computer Science 2026-04-07 Haowei Zhu, Ji Liu, Ziqiong Liu, Dong Li, Junhai Yong, Bin Wang, Emad Barsoum

FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation

Diffusion Transformers (DiT) are powerful generative models but remain computationally intensive due to their iterative structure and deep transformer stacks. To alleviate this inefficiency, we propose FastCache, a…

Machine Learning · Computer Science 2026-03-30 Dong Liu, Yanxuan Yu, Jiayi Zhang, Yifan Li, Ben Lengerich, Ying Nian Wu

H2-Cache: A Novel Hierarchical Dual-Stage Cache for High-Performance Acceleration of Generative Diffusion Models

Diffusion models have emerged as state-of-the-art in image generation, but their practical deployment is hindered by the significant computational cost of their iterative denoising process. While existing caching techniques can accelerate…

Computer Vision and Pattern Recognition · Computer Science 2025-11-06 Mingyu Sung, Il-Min Kim, Sangseok Yun, Jae-Mo Kang

DiCache: Let Diffusion Model Determine Its Own Cache

Recent years have witnessed the rapid development of acceleration techniques for diffusion models, especially caching-based acceleration methods. These studies seek to answer two fundamental questions: "When to cache" and "How to use…

Computer Vision and Pattern Recognition · Computer Science 2025-10-03 Jiazi Bu, Pengyang Ling, Yujie Zhou, Yibin Wang, Yuhang Zang, Dahua Lin, Jiaqi Wang

Token Caching for Diffusion Transformer Acceleration

Diffusion transformers have gained substantial interest in diffusion generative modeling due to their outstanding performance. However, their computational demands, particularly the quadratic complexity of attention mechanisms and…

Machine Learning · Computer Science 2026-01-28 Jinming Lou, Wenyang Luo, Yufan Liu, Bing Li, Xinmiao Ding, Weiming Hu, Yuming Li, Chenguang Ma

DeepCache: Accelerating Diffusion Models for Free

Diffusion models have recently gained unprecedented attention in the field of image synthesis due to their remarkable generative capabilities. Notwithstanding their prowess, these models often incur substantial computational costs,…

Computer Vision and Pattern Recognition · Computer Science 2023-12-11 Xinyin Ma, Gongfan Fang, Xinchao Wang

Accelerating Diffusion Transformer via Gradient-Optimized Cache

Feature caching has emerged as an effective strategy to accelerate diffusion transformer (DiT) sampling through temporal feature reuse. It is a challenging problem since (1) Progressive error accumulation from cached blocks significantly…

Computer Vision and Pattern Recognition · Computer Science 2025-07-21 Junxiang Qiu, Lin Liu, Shuo Wang, Jinda Lu, Kezhou Chen, Yanbin Hao

Accelerating Diffusion Transformer via Error-Optimized Cache

Diffusion Transformer (DiT) is a crucial method for content generation. However, it needs a lot of time to sample. Many studies have attempted to use caching to reduce the time consumption of sampling. Existing caching methods accelerate…

Computer Vision and Pattern Recognition · Computer Science 2025-07-21 Junxiang Qiu, Shuo Wang, Jinda Lu, Lin Liu, Houcheng Jiang, Xingyu Zhu, Yanbin Hao

ProCache: Constraint-Aware Feature Caching with Selective Computation for Diffusion Transformer Acceleration

Diffusion Transformers (DiTs) have achieved state-of-the-art performance in generative modeling, yet their high computational cost hinders real-time deployment. While feature caching offers a promising training-free acceleration solution by…

Computer Vision and Pattern Recognition · Computer Science 2026-02-16 Fanpu Cao, Yaofo Chen, Zeng You, Wei Luo

LightCache: Memory-Efficient, Training-Free Acceleration for Video Generation

Training-free acceleration has emerged as an advanced research area in video generation based on diffusion models. The redundancy of latents in diffusion model inference provides a natural entry point for acceleration. In this paper, we…

Computer Vision and Pattern Recognition · Computer Science 2025-10-08 Yang Xiao, Gen Li, Kaiyuan Deng, Yushu Wu, Zheng Zhan, Yanzhi Wang, Xiaolong Ma, Bo Hui

Accelerating Diffusion Transformers with Token-wise Feature Caching

Diffusion transformers have shown significant effectiveness in both image and video synthesis at the expense of huge computation costs. To address this problem, feature caching methods have been introduced to accelerate diffusion…

Machine Learning · Computer Science 2025-02-20 Chang Zou, Xuyang Liu, Ting Liu, Siteng Huang, Linfeng Zhang

A Survey on Cache Methods in Diffusion Models: Toward Efficient Multi-Modal Generation

Diffusion Models have become a cornerstone of modern generative AI for their exceptional generation quality and controllability. However, their inherent multi-step iterations and complex backbone networks lead to…

Machine Learning · Computer Science 2025-11-04 Jiacheng Liu, Xinyu Wang, Yuqi Lin, Zhikai Wang, Peiru Wang, Peiliang Cai, Qinming Zhou, Zhengan Yan, Zexuan Yan, Zhengyi Shi, Chang Zou, Yue Ma, Linfeng Zhang

Cache Me if You Can: Accelerating Diffusion Models through Block Caching

Diffusion models have recently revolutionized the field of image synthesis due to their ability to generate photorealistic images. However, one of the major drawbacks of diffusion models is that the image generation process is costly. A…

Computer Vision and Pattern Recognition · Computer Science 2024-01-15 Felix Wimbauer, Bichen Wu, Edgar Schoenfeld, Xiaoliang Dai, Ji Hou, Zijian He, Artsiom Sanakoyeu, Peizhao Zhang, Sam Tsai, Jonas Kohler, Christian Rupprecht, Daniel Cremers, Peter Vajda, Jialiang Wang

KDC-Diff: A Latent-Aware Diffusion Model with Knowledge Retention for Memory-Efficient Image Generation

The growing adoption of generative AI in real-world applications has exposed a critical bottleneck in the computational demands of diffusion-based text-to-image models. In this work, we propose KDC-Diff, a novel and scalable generative…

Computer Vision and Pattern Recognition · Computer Science 2025-10-01 Md. Naimur Asif Borno, Md Sakib Hossain Shovon, Asmaa Soliman Al-Moisheer, Mohammad Ali Moni

d²Cache: Accelerating Diffusion-Based LLMs via Dual Adaptive Caching

Diffusion-based large language models (dLLMs), despite their promising performance, still suffer from inferior inference efficiency. This is because dLLMs rely on bidirectional attention and cannot directly benefit from the standard…

Computation and Language · Computer Science 2026-02-17 Yuchu Jiang, Yue Cai, Xiangzhong Luo, Jiale Fu, Jiarui Wang, Chonghan Liu, Xu Yang

Frequency-Aware Error-Bounded Caching for Accelerating Diffusion Transformers

Diffusion Transformers (DiTs) have emerged as the dominant architecture for high-quality image and video generation, yet their iterative denoising process incurs substantial computational cost during inference. Existing caching methods…

Computer Vision and Pattern Recognition · Computer Science 2026-03-06 Guandong Li

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Generating temporally-consistent high-fidelity videos can be computationally expensive, especially over longer temporal spans. More-recent Diffusion Transformers (DiTs) -- despite making significant headway in this context -- have only…

Computer Vision and Pattern Recognition · Computer Science 2024-11-08 Kumara Kahatapitiya, Haozhe Liu, Sen He, Ding Liu, Menglin Jia, Chenyang Zhang, Michael S. Ryoo, Tian Xie

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

Diffusion Transformers have emerged as the preeminent models for a wide array of generative tasks, demonstrating superior performance and efficacy across various applications. The promising results come at the cost of slow inference, as…

Machine Learning · Computer Science 2025-03-24 Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Yanyu Li, Yifan Gong, Kai Zhang, Hao Tan, Jason Kuen, Henghui Ding, Zhihao Shu, Wei Niu, Pu Zhao, Yanzhi Wang, Jiuxiang Gu

Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep

Diffusion-based video editing has emerged as an important paradigm for high-quality and flexible content generation. However, despite their generality and strong modeling capacity, Diffusion Transformers (DiT) remain computationally…

Computer Vision and Pattern Recognition · Computer Science 2026-03-26 Tianyi Liu, Ye Lu, Linfeng Zhang, Chen Cai, Jianjun Gao, Yi Wang, Kim-Hui Yap, Lap-Pui Chau

HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration

Diffusion Transformers (DiTs) excel in generative tasks but face practical deployment challenges due to high inference costs. Feature caching, which stores and retrieves redundant computations, offers the potential for acceleration.…

Computer Vision and Pattern Recognition · Computer Science 2025-06-03 Yushi Huang, Zining Wang, Ruihao Gong, Jing Liu, Xinjie Zhang, Jinyang Guo, Xianglong Liu, Jun Zhang