Prototype-based Dataset Comparison

Computer Vision and Pattern Recognition 2023-09-06 v1 Multimedia

Abstract

Dataset summarisation is a fruitful approach to dataset inspection. However, when applied to a single dataset, the discovery of visual concepts is restricted to the most prominent ones. We argue that a comparative approach can expand upon this paradigm to enable richer forms of dataset inspection that go beyond the most prominent concepts. To enable dataset comparison, we present a module that learns concept-level prototypes across datasets. We leverage self-supervised learning to discover these prototypes without supervision, and we demonstrate the benefits of our approach in two case studies. Our findings show that dataset comparison extends dataset inspection, and we hope to encourage more work in this direction. Code and usage instructions are available at https://github.com/Nanne/ProtoSim

Cite

@article{arxiv.2309.02401,
  title  = {Prototype-based Dataset Comparison},
  author = {Nanne van Noord},
  journal= {arXiv preprint arXiv:2309.02401},
  year   = {2023}
}

Comments

To be presented at ICCV 2023
