Detecting out-of-distribution (OOD) inputs is pivotal for deploying safe vision systems in open-world environments. We revisit diffusion models, not as generators, but as universal perceptual templates for OOD detection. This research explores the use of score-based generative models as foundational tools for semantic anomaly detection across unseen datasets. Specifically, we leverage the denoising trajectories of Denoising Diffusion Models (DDMs) as a rich source of texture and semantic information. By analyzing Stein score errors, amplified through the Structural Similarity Index Metric (SSIM), we introduce a novel method for identifying anomalous samples without requiring re-training on each target dataset. Our approach improves over the state of the art and relies on training a single model on one dataset, CelebA, which we find to be an effective base distribution, even outperforming more commonly used datasets like ImageNet in several settings. Experimental results show near-perfect performance on some benchmarks, with notable headroom on others, highlighting both the strength and future potential of generative foundation models in anomaly detection.
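The abstract does not spell out the scoring rule, so the following is only a rough sketch of the general idea of scoring an input by how poorly a denoiser trained on a base distribution restores it, with the error measured through SSIM. Everything here is an assumption made for illustration: `global_ssim` is a simplified single-window SSIM, `anomaly_score` is a hypothetical scoring function, and `box_blur` is a toy stand-in for a trained diffusion denoiser (it plays the role of a model that only "knows" smooth images). It is not the authors' implementation.

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Simplified SSIM computed over the whole image as one window."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return num / den

def box_blur(img, sigma=None):
    """Toy denoiser (assumption): 3x3 mean filter; sigma is ignored."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    for di in range(3):
        for dj in range(3):
            out += padded[di:di + h, dj:dj + w]
    return out / 9.0

def anomaly_score(x, denoise, sigmas, rng):
    """Hypothetical score: mean SSIM dissimilarity between the input and
    its denoised reconstruction over several noise levels."""
    errors = []
    for sigma in sigmas:
        noisy = x + sigma * rng.standard_normal(x.shape)
        x_hat = denoise(noisy, sigma)
        errors.append(1.0 - global_ssim(x, x_hat))
    return float(np.mean(errors))

# Smooth gradient (matches the toy denoiser's prior) vs. a 1-pixel checkerboard.
smooth = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
texture = (np.indices((32, 32)).sum(axis=0) % 2).astype(float)

rng = np.random.default_rng(0)
score_smooth = anomaly_score(smooth, box_blur, [0.05, 0.1], rng)
score_texture = anomaly_score(texture, box_blur, [0.05, 0.1], rng)
print(score_smooth, score_texture)
```

In this toy setup the checkerboard receives the higher score because the stand-in denoiser cannot restore its high-frequency structure. With a real diffusion model, `denoise` would be replaced by the network's denoised estimate at each noise level, and the denoising error would relate to the score error through Tweedie's formula.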
@article{arxiv.2507.22692,
title = {Zero-Shot Image Anomaly Detection Using Generative Foundation Models},
author = {Lemar Abdi and Amaan Valiuddin and Francisco Caetano and Christiaan Viviers and Fons van der Sommen},
journal = {arXiv preprint arXiv:2507.22692},
year = {2025}
}
Comments: Accepted at the Workshop on Anomaly Detection with Foundation Models, ICCV 2025