Stable Diffusion Dataset Generation for Downstream Classification Tasks

Machine Learning; Artificial Intelligence; Computer Vision and Pattern Recognition (2024-05-07, v1)

Abstract

Recent advances in generative artificial intelligence have enabled the creation of high-quality synthetic data that closely mimics real-world data. This paper explores the adaptation of the Stable Diffusion 2.0 model for generating synthetic datasets, using Transfer Learning, Fine-Tuning and generation parameter optimisation techniques to improve the utility of the dataset for downstream classification tasks. We present a class-conditional version of the model that exploits a Class-Encoder and optimisation of key generation parameters. Our methodology led to synthetic datasets that, in a third of cases, produced models that outperformed those trained on real datasets.

Cite

@article{arxiv.2405.02698,
  title  = {Stable Diffusion Dataset Generation for Downstream Classification Tasks},
  author = {Eugenio Lomurno and Matteo D'Oria and Matteo Matteucci},
  journal= {arXiv preprint arXiv:2405.02698},
  year   = {2024}
}