Ensembling object detectors for image and video data analysis
Abstract
In this paper, we propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data. We further extend it to video data by proposing a two-stage tracking-based scheme for detection refinement. The proposed method can be used as a standalone approach for improving object detection performance, or as a part of a framework for faster bounding box annotation in unseen datasets, assuming that the objects of interest are those present in some common public datasets.
- Publication:
-
arXiv e-prints
- Pub Date:
- February 2021
- DOI:
- 10.48550/arXiv.2102.04798
- arXiv:
- arXiv:2102.04798
- Bibcode:
- 2021arXiv210204798C
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning
- E-Print:
- Accepted to ICASSP 2021.(C)2021 IEEE.Personal use of this material is permitted.Permission from IEEE must be obtained for all other uses,in any current or future media,including reprinting/republishing this material for advertising or promotional purposes,creating new collective works,for resale or redistribution to servers or lists,or reuse of any copyrighted component of this work in other works