Reference Pose Generation for Long-term Visual Localization via Learned Features and View Synthesis

doi:10.48550/arXiv.2005.05179

Visual localization is one of the key enabling technologies for autonomous driving and augmented reality. High-quality datasets with accurate 6 Degree-of-Freedom (DoF) reference poses are the foundation for benchmarking and improving existing methods. Traditionally, reference poses have been obtained via Structure-from-Motion (SfM). However, SfM itself relies on local features, which are prone to fail when images were taken under different conditions, e.g., day/night changes. At the same time, manually annotating feature correspondences is not scalable and potentially inaccurate. In this work, we propose a semi-automated approach to generate reference poses based on feature matching between renderings of a 3D model and real images via learned features. Given an initial pose estimate, our approach iteratively refines the pose based on feature matches against a rendering of the model from the current pose estimate. We significantly improve the nighttime reference poses of the popular Aachen Day-Night dataset, showing that state-of-the-art visual localization methods perform better (up to 47%) than predicted by the original reference poses. We extend the dataset with new nighttime test images, provide uncertainty estimates for our new reference poses, and introduce a new evaluation criterion. We will make our reference poses and our framework publicly available upon publication.

Publication: arXiv e-prints

Pub Date: May 2020

DOI: 10.48550/arXiv.2005.05179

arXiv: arXiv:2005.05179

Bibcode: 2020arXiv200505179Z

Keywords: Computer Science - Computer Vision and Pattern Recognition

E-Print: 25 pages, 16 figures. Int J Comput Vis (2020)
