English
Related papers

Related papers: DiffSparse: Accelerating Diffusion Transformers wi…

200 papers

Diffusion transformers have shown significant effectiveness in both image and video synthesis at the expense of huge computation costs. To address this problem, feature caching methods have been introduced to accelerate diffusion…

Machine Learning · Computer Science 2025-02-20 Chang Zou , Xuyang Liu , Ting Liu , Siteng Huang , Linfeng Zhang

Diffusion Transformers have recently demonstrated unprecedented generative capabilities for various tasks. The encouraging results, however, come with the cost of slow inference, since each denoising step requires inference on a transformer…

Machine Learning · Computer Science 2024-11-19 Xinyin Ma , Gongfan Fang , Michael Bi Mi , Xinchao Wang

Diffusion transformers have gained substantial interest in diffusion generative modeling due to their outstanding performance. However, their computational demands, particularly the quadratic complexity of attention mechanisms and…

Machine Learning · Computer Science 2026-01-28 Jinming Lou , Wenyang Luo , Yufan Liu , Bing Li , Xinmiao Ding , Weiming Hu , Yuming Li , Chenguang Ma

Transformer-based diffusion models offer superior scalability and performance but suffer from high computational overhead due to the iterative nature and quadratic complexity of self-attention at high resolutions. In this paper, we propose…

Hardware Architecture · Computer Science 2026-05-26 Jieon Yoon , Hangyeol Lee , Jaehoon Heo , Joo-Young Kim

Diffusion Transformers (DiT) are powerful generative models but remain computationally intensive due to their iterative structure and deep transformer stacks. To alleviate this inefficiency, we propose \textbf{FastCache}, a…

Machine Learning · Computer Science 2026-03-30 Dong Liu , Yanxuan Yu , Jiayi Zhang , Yifan Li , Ben Lengerich , Ying Nian Wu

This paper presents a method to accelerate the inference process of diffusion transformer (DiT)-based text-to-speech (TTS) models by applying a selective caching mechanism to transformer layers. Specifically, I integrate SmoothCache into…

Audio and Speech Processing · Electrical Eng. & Systems 2025-09-11 Siratish Sakpiboonchit

Diffusion models represent a powerful family of generative models widely used for image and video generation. However, the time-consuming deployment, long inference time, and requirements on large memory hinder their applications on…

Machine Learning · Computer Science 2025-04-18 Kafeng Wang , Jianfei Chen , He Li , Zhenpeng Mi , Jun Zhu

Image tokenization plays a central role in modern generative modeling by mapping visual inputs into compact representations that serve as an intermediate signal between pixels and generative models. Diffusion-based decoders have recently…

Computer Vision and Pattern Recognition · Computer Science 2026-03-23 Chuhan Wang , Hao Chen

Diffusion models are widely recognized for generating high-quality and diverse images, but their poor real-time performance has led to numerous acceleration works, primarily focusing on UNet-based structures. With the more successful…

Computer Vision and Pattern Recognition · Computer Science 2024-06-04 Pengtao Chen , Mingzhu Shen , Peng Ye , Jianjian Cao , Chongjun Tu , Christos-Savvas Bouganis , Yiren Zhao , Tao Chen

Diffusion Transformers (DiTs) have achieved state-of-the-art performance in generative modeling, yet their high computational cost hinders real-time deployment. While feature caching offers a promising training-free acceleration solution by…

Computer Vision and Pattern Recognition · Computer Science 2026-02-16 Fanpu Cao , Yaofo Chen , Zeng You , Wei Luo

Diffusion Transformers (DiT) have become the dominant methods in image and video generation yet still suffer substantial computational costs. As an effective approach for DiT acceleration, feature caching methods are designed to cache the…

Machine Learning · Computer Science 2025-11-19 Chang Zou , Evelyn Zhang , Runlin Guo , Haohang Xu , Conghui He , Xuming Hu , Linfeng Zhang

Diffusion Transformers (DiT) are renowned for their impressive generative performance; however, they are significantly constrained by considerable computational costs due to the quadratic complexity in self-attention and the extensive…

Computer Vision and Pattern Recognition · Computer Science 2025-09-24 Shuning Chang , Pichao Wang , Jiasheng Tang , Fan Wang , Yi Yang

Diffusion models have recently gained unprecedented attention in the field of image synthesis due to their remarkable generative capabilities. Notwithstanding their prowess, these models often incur substantial computational costs,…

Computer Vision and Pattern Recognition · Computer Science 2023-12-11 Xinyin Ma , Gongfan Fang , Xinchao Wang

Diffusion transformers have gained significant attention in recent years for their ability to generate high-quality images and videos, yet still suffer from a huge computational cost due to their iterative denoising process. Recently,…

Computer Vision and Pattern Recognition · Computer Science 2025-09-15 Zhixin Zheng , Xinyu Wang , Chang Zou , Shaobo Wang , Linfeng Zhang

Recent years have witnessed the rapid development of acceleration techniques for diffusion models, especially caching-based acceleration methods. These studies seek to answer two fundamental questions: "When to cache" and "How to use…

Computer Vision and Pattern Recognition · Computer Science 2025-10-03 Jiazi Bu , Pengyang Ling , Yujie Zhou , Yibin Wang , Yuhang Zang , Dahua Lin , Jiaqi Wang

While diffusion models have achieved great success in the field of video generation, this progress is accompanied by a rapidly escalating computational burden. Among the existing acceleration methods, Feature Caching is popular due to its…

Computer Vision and Pattern Recognition · Computer Science 2026-04-21 Chang Zou , Changlin Li , Yang Li , Patrol Li , Jianbing Wu , Xiao He , Songtao Liu , Zhao Zhong , Kailin Huang , Linfeng Zhang

Diffusion models produce high quality images but inference is costly due to many denoising steps and heavy matrix operations. We present DiffPro, a post-training, hardware-faithful framework that works with the exact integer kernels used in…

Machine Learning · Computer Science 2025-11-17 Farhana Amin , Sabiha Afroz , Kanchon Gharami , Mona Moghadampanah , Dimitrios S. Nikolopoulos

Diffusion-based image generation models excel at producing high-quality synthetic content, but suffer from slow and computationally expensive inference. Prior work has attempted to mitigate this by caching and reusing features within…

Computer Vision and Pattern Recognition · Computer Science 2026-03-04 Anirud Aggarwal , Abhinav Shrivastava , Matthew Gwilliam

Efficient Transformers have been developed for long sequence modeling, due to their subquadratic memory and time complexity. Sparse Transformer is a popular approach to improving the efficiency of Transformers by restricting self-attention…

Machine Learning · Computer Science 2023-02-01 Aosong Feng , Irene Li , Yuang Jiang , Rex Ying

Diffusion models deliver high-fidelity synthesis but remain slow due to iterative sampling. We empirically observe there exists feature invariance in deterministic sampling, and present InvarDiff, a training-free acceleration method that…

Computer Vision and Pattern Recognition · Computer Science 2025-12-08 Zihao Wu
‹ Prev 1 2 3 10 Next ›