SPAMming Labels: Efficient Annotations for the Trackers of Tomorrow
Abstract
Increasing the annotation efficiency of trajectory annotations from videos has the potential to enable the next generation of data-hungry tracking algorithms to thrive on large-scale datasets. Despite the importance of this task, there are currently very few works exploring how to efficiently label tracking datasets comprehensively. In this work, we introduce SPAM, a video label engine that provides high-quality labels with minimal human intervention. SPAM is built around two key insights: i) most tracking scenarios can be easily resolved. To take advantage of this, we utilize a pre-trained model to generate high-quality pseudo-labels, reserving human involvement for a smaller subset of more difficult instances; ii) handling the spatiotemporal dependencies of track annotations across time can be elegantly and efficiently formulated through graphs. Therefore, we use a unified graph formulation to address the annotation of both detections and identity association for tracks across time. Based on these insights, SPAM produces high-quality annotations with a fraction of ground truth labeling cost. We demonstrate that trackers trained on SPAM labels achieve comparable performance to those trained on human annotations while requiring only $3-20\%$ of the human labeling effort. Hence, SPAM paves the way towards highly efficient labeling of large-scale tracking datasets. We release all models and code.
- Publication:
-
arXiv e-prints
- Pub Date:
- April 2024
- DOI:
- 10.48550/arXiv.2404.11426
- arXiv:
- arXiv:2404.11426
- Bibcode:
- 2024arXiv240411426C
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- ECCV 2024