Feature Space Analysis by Guided Diffusion Model

Computer Vision and Pattern Recognition 2025-09-30 v2 Image and Video Processing

Abstract

One of the key issues in Deep Neural Networks (DNNs) is the black-box nature of their internal feature extraction process. Targeting vision-related domains, this paper analyses the feature space of a DNN by proposing a decoder that generates images whose features are guaranteed to closely match a user-specified feature. Owing to this guarantee, which past studies lack, our decoder provides evidence of which image attributes are encoded in the user-specified feature. The decoder is implemented as a guided diffusion model: it guides the reverse image generation of a pre-trained diffusion model so as to minimise the Euclidean distance between the user-specified feature and the feature of the clean image estimated at each step. One practical advantage is that the decoder can analyse the feature spaces of different DNNs with no additional training and runs on a single commercial off-the-shelf (COTS) GPU. Experimental results on CLIP's image encoder, ResNet-50 and a vision transformer demonstrate that images generated by our decoder have features remarkably similar to the user-specified ones and reveal valuable insights into these DNNs' feature spaces.
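
The guidance idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the standard DDPM clean-image estimate, replaces the DNN with a hypothetical linear encoder `W`, uses random numbers in place of a pre-trained noise predictor, and treats the predicted noise as constant when differentiating. All names (`estimate_clean_image`, `guidance_gradient`, `W`, `alpha_bar_t`) are illustrative assumptions.

```python
import numpy as np


def estimate_clean_image(x_t, eps_pred, alpha_bar_t):
    """Standard DDPM estimate of the clean image from a noisy sample x_t."""
    return (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)


def feature_distance(x_t, eps_pred, alpha_bar_t, W, target):
    """Euclidean distance between the estimated clean image's feature and the target."""
    x0_hat = estimate_clean_image(x_t, eps_pred, alpha_bar_t)
    return float(np.linalg.norm(W @ x0_hat - target))


def guidance_gradient(x_t, eps_pred, alpha_bar_t, W, target):
    """Gradient w.r.t. x_t of the squared feature distance.

    Simplification (assumption): eps_pred is held constant, so
    d(x0_hat)/d(x_t) = 1 / sqrt(alpha_bar_t).
    """
    x0_hat = estimate_clean_image(x_t, eps_pred, alpha_bar_t)
    residual = W @ x0_hat - target
    return 2.0 * (W.T @ residual) / np.sqrt(alpha_bar_t)


# Toy setup: 16-dimensional "image", 4-dimensional "feature".
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 16))    # hypothetical linear stand-in for the DNN encoder
x_t = rng.standard_normal(16)       # noisy sample at some reverse step
eps_pred = rng.standard_normal(16)  # stand-in for the diffusion model's noise prediction
target = rng.standard_normal(4)     # user-specified feature
alpha_bar_t = 0.5                   # cumulative noise-schedule coefficient (assumed value)

# Step size chosen below the stability limit of this quadratic objective.
step = 0.5 * alpha_bar_t / np.linalg.norm(W, 2) ** 2

before = feature_distance(x_t, eps_pred, alpha_bar_t, W, target)
x_guided = x_t - step * guidance_gradient(x_t, eps_pred, alpha_bar_t, W, target)
after = feature_distance(x_guided, eps_pred, alpha_bar_t, W, target)
print(f"feature distance before guidance: {before:.4f}, after: {after:.4f}")
```

With a real encoder such as CLIP or ResNet-50 the gradient would come from automatic differentiation rather than this closed form, and the guidance step would be interleaved with the diffusion model's own reverse updates.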

Cite

@article{arxiv.2509.07936,
  title  = {Feature Space Analysis by Guided Diffusion Model},
  author = {Kimiaki Shirahama and Miki Yanobu and Kaduki Yamashita and Miho Ohsaki},
  journal= {arXiv preprint arXiv:2509.07936},
  year   = {2025}
}

Comments

37 pages, 13 figures, code: https://github.com/KimiakiShirahama/FeatureSpaceAnalysisByGuidedDiffusionModel
