Prototype-based Dataset Comparison
Abstract
Dataset summarisation is a fruitful approach to dataset inspection. However, when applied to a single dataset the discovery of visual concepts is restricted to those most prominent. We argue that a comparative approach can expand upon this paradigm to enable richer forms of dataset inspection that go beyond the most prominent concepts. To enable dataset comparison we present a module that learns concept-level prototypes across datasets. We leverage self-supervised learning to discover these prototypes without supervision, and we demonstrate the benefits of our approach in two case-studies. Our findings show that dataset comparison extends dataset inspection and we hope to encourage more works in this direction. Code and usage instructions available at https://github.com/Nanne/ProtoSim
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2023
- DOI:
- 10.48550/arXiv.2309.02401
- arXiv:
- arXiv:2309.02401
- Bibcode:
- 2023arXiv230902401V
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Multimedia
- E-Print:
- To be presented at ICCV 2023