
Related papers: Predict to Skip: Linear Multistep Feature Forecast…

200 papers

Diffusion Transformer (DiT), an emerging diffusion model for image generation, has demonstrated superior performance but suffers from substantial computational costs. Our investigations reveal that these costs stem from the static inference…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-10 · Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Yibing Song, Gao Huang, Fan Wang, Yang You

Diffusion Transformer (DiT), an emerging diffusion model for visual generation, has demonstrated superior performance but suffers from substantial computational costs. Our investigations reveal that these costs primarily stem from the…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-15 · Wangbo Zhao, Yizeng Han, Jiasheng Tang, Kai Wang, Hao Luo, Yibing Song, Gao Huang, Fan Wang, Yang You

Diffusion Transformers (DiT) have revolutionized high-fidelity image and video synthesis, yet their computational demands remain prohibitive for real-time applications. To solve this problem, feature caching has been proposed to accelerate…

Computer Vision and Pattern Recognition · Computer Science · 2025-08-12 · Jiacheng Liu, Chang Zou, Yuanhuiyi Lyu, Junjie Chen, Linfeng Zhang

Diffusion models are widely recognized for generating high-quality and diverse images, but their poor real-time performance has led to numerous acceleration works, primarily focusing on UNet-based structures. With the more successful…

Computer Vision and Pattern Recognition · Computer Science · 2024-06-04 · Pengtao Chen, Mingzhu Shen, Peng Ye, Jianjian Cao, Chongjun Tu, Christos-Savvas Bouganis, Yiren Zhao, Tao Chen

Diffusion Transformers (DiTs) have achieved state-of-the-art performance in generative modeling, yet their high computational cost hinders real-time deployment. While feature caching offers a promising training-free acceleration solution by…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-16 · Fanpu Cao, Yaofo Chen, Zeng You, Wei Luo

Despite their remarkable performance, modern Diffusion Transformers are hindered by substantial resource requirements during inference, stemming from the fixed and large amount of compute needed for each denoising step. In this work, we…

Diffusion Transformers (DiTs) excel at visual generation yet remain hampered by slow sampling. Existing training-free accelerators - step reduction, feature caching, and sparse attention - enhance inference speed but typically rely on a…

Computer Vision and Pattern Recognition · Computer Science · 2025-09-29 · Wangbo Zhao, Yizeng Han, Zhiwei Tang, Jiasheng Tang, Pengfei Zhou, Kai Wang, Bohan Zhuang, Zhangyang Wang, Fan Wang, Yang You

Diffusion Transformers (DiTs) have demonstrated remarkable generative capabilities, particularly benefiting from Transformer architectures that enhance visual and artistic fidelity. However, their inherently sequential denoising process…

Computer Vision and Pattern Recognition · Computer Science · 2026-01-08 · Hanqi Chen, Xu Zhang, Xiaoliu Guan, Lielin Jiang, Guanzhong Wang, Zeyu Chen, Yi Liu

Diffusion Transformers (DiTs) have demonstrated remarkable performance in visual generation tasks. However, their low inference speed limits their deployment in low-resource applications. Recent training-free approaches exploit the…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-11 · Xiaoliu Guan, Lielin Jiang, Hanqi Chen, Xu Zhang, Jiaxing Yan, Guanzhong Wang, Yi Liu, Zetao Zhang, Yu Wu

While the overall inference latency of Video Diffusion Transformers (DiTs) can be substantially reduced through model distillation, per-step inference latency remains a critical bottleneck. Existing acceleration paradigms primarily exploit…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-13 · Jian Tang, Jiawei Fan, Qingbin Liu, Zheng Wei

To address the high sampling cost of Diffusion Transformers (DiTs), feature caching offers a training-free acceleration method. However, existing methods rely on hand-crafted forecasting formulas that fail under aggressive skipping. We…

Computer Vision and Pattern Recognition · Computer Science · 2026-04-30 · Zhirong Shen, Rui Huang, Jiacheng Liu, Chang Zou, Peiliang Cai, Shikang Zheng, Zhengyi Shi, Liang Feng, Linfeng Zhang

Diffusion models have recently achieved great success in the synthesis of high-quality images and videos. However, the existing denoising techniques in diffusion models are commonly based on step-by-step noise predictions, which suffers…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-15 · Hancheng Ye, Jiakang Yuan, Renqiu Xia, Xiangchao Yan, Tao Chen, Junchi Yan, Botian Shi, Bo Zhang

Recent advances in diffusion models have demonstrated remarkable capabilities in video generation. However, the computational intensity remains a significant challenge for practical applications. While feature caching has been proposed to…

Computer Vision and Pattern Recognition · Computer Science · 2025-07-29 · Xuran Ma, Yexin Liu, Yaofu Liu, Xianfeng Wu, Mingzhe Zheng, Zihao Wang, Ser-Nam Lim, Harry Yang

Diffusion Transformers (DiT) have emerged as a powerful architecture for image and video generation, offering superior quality and scalability. However, their practical application suffers from inherent dynamic feature instability, leading…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-10 · Guanjie Chen, Xinyu Zhao, Yucheng Zhou, Xiaoye Qu, Tianlong Chen, Yu Cheng

Diffusion Transformers (DiT) have attracted significant attention in research. However, they suffer from a slow convergence rate. In this paper, we aim to accelerate DiT training without any architectural modification. We identify the…

Computer Vision and Pattern Recognition · Computer Science · 2024-11-01 · Jingfeng Yao, Wang Cheng, Wenyu Liu, Xinggang Wang

Diffusion models have significantly reshaped the field of generative artificial intelligence and are now increasingly explored for their capacity in discriminative representation learning. Diffusion Transformer (DiT) has recently gained…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-30 · Changyu Liu, James Chenhao Liang, Wenhao Yang, Yiming Cui, Jinghao Yang, Tianyang Wang, Qifan Wang, Dongfang Liu, Cheng Han

Diffusion Transformers (DiT) have demonstrated remarkable generative capabilities but remain highly computationally expensive. Previous acceleration methods, such as pruning and distillation, typically rely on a fixed computational…

Computer Vision and Pattern Recognition · Computer Science · 2026-02-17 · Jiangshan Wang, Zeqiang Lai, Jiarui Chen, Jiayi Guo, Hang Guo, Xiu Li, Xiangyu Yue, Chunchao Guo

Diffusion Transformers (DiTs) achieve state-of-the-art generation quality but require long sequential denoising trajectories, leading to high inference latency. Recent speculative inference methods enable lossless parallel sampling in…

Computer Vision and Pattern Recognition · Computer Science · 2025-11-26 · Xinwan Wen, Bowen Li, Jiajun Luo, Ye Li, Zhi Wang

Diffusion models have achieved remarkable success in image and video generation tasks. However, the high computational demands of Diffusion Transformers (DiTs) pose a significant challenge to their practical deployment. While feature…

Computer Vision and Pattern Recognition · Computer Science · 2026-05-28 · Peiliang Cai, Jiacheng Liu, Haowen Xu, Xinyu Wang, Chang Zou, Linfeng Zhang

Diffusion models have become the dominant tool for high-fidelity image and video generation, yet are critically bottlenecked by their inference speed due to the numerous iterative passes of Diffusion Transformers. To reduce the exhaustive…

Computer Vision and Pattern Recognition · Computer Science · 2026-03-03 · Jiaqi Han, Juntong Shi, Puheng Li, Haotian Ye, Qiushan Guo, Stefano Ermon