Pluggable Pruning with Contiguous Layer Distillation for Diffusion Transformers

Computer Vision and Pattern Recognition 2026-02-25 v2

Abstract

Diffusion Transformers (DiTs) have shown exceptional performance in image generation, yet their large parameter counts incur high computational costs, impeding deployment in resource-constrained settings. To address this, we propose Pluggable Pruning with Contiguous Layer Distillation (PPCL), a flexible structured pruning framework designed specifically for DiT architectures. First, we identify redundant layer intervals through a linear probing mechanism combined with first-order differential trend analysis of similarity metrics. We then propose a plug-and-play teacher-student alternating distillation scheme that integrates depth-wise and width-wise pruning within a single training phase. This distillation framework enables flexible knowledge transfer across diverse pruning ratios, eliminating the need for per-configuration retraining. Extensive experiments on multiple Multi-Modal Diffusion Transformer models demonstrate that PPCL achieves a 50% reduction in parameter count relative to the full model, with less than 3% degradation in key objective metrics. Notably, our method maintains high-quality image generation while achieving higher compression ratios, making it well suited to resource-constrained environments. The open-source code and checkpoints for PPCL are available at https://github.com/OPPO-Mente-Lab/Qwen-Image-Pruning.
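The abstract does not spell out how the first-order trend analysis selects layer intervals, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a per-layer similarity score is already available (for example from a probe) and flags contiguous runs of layers whose similarity barely changes from one layer to the next. The function names, the `tol` threshold, and the `min_len` parameter are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def first_order_diff(values: List[float]) -> List[float]:
    """Difference between each pair of consecutive per-layer scores."""
    return [b - a for a, b in zip(values, values[1:])]


def flat_intervals(
    similarity: List[float],
    tol: float = 0.02,   # hypothetical threshold on |difference|
    min_len: int = 2,    # hypothetical minimum number of layers in a run
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) layer-index intervals where the
    similarity changes by at most `tol` between consecutive layers.

    This is an assumed criterion for a "redundant" contiguous interval;
    the paper's actual rule may differ.
    """
    diffs = first_order_diff(similarity)
    intervals: List[Tuple[int, int]] = []
    start = None
    for i, d in enumerate(diffs):
        if abs(d) <= tol:
            # diffs[i] links layer i and layer i + 1
            if start is None:
                start = i
        elif start is not None:
            # the flat run covers layers start..i
            if i - start + 1 >= min_len:
                intervals.append((start, i))
            start = None
    if start is not None:
        end = len(similarity) - 1
        if end - start + 1 >= min_len:
            intervals.append((start, end))
    return intervals


# Made-up scores for an 8-layer model, purely illustrative.
scores = [0.60, 0.75, 0.90, 0.91, 0.92, 0.915, 0.80, 0.70]
candidate_intervals = flat_intervals(scores)
```

With these made-up scores, layers 2 through 5 form the only run in which the similarity plateaus, so they would be the candidate contiguous interval under this assumed criterion.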

Cite

@article{arxiv.2511.16156,
  title  = {Pluggable Pruning with Contiguous Layer Distillation for Diffusion Transformers},
  author = {Jian Ma and Qirong Peng and Xujie Zhu and Peixing Xie and Chen Chen and Haonan Lu},
  journal= {arXiv preprint arXiv:2511.16156},
  year   = {2026}
}

Comments

Accepted to CVPR 2026
