End2End Multi-View Feature Matching with Differentiable Pose Optimization

Erroneous feature matches have a severe impact on subsequent camera pose estimation and often require additional, time-consuming measures, such as RANSAC, for outlier rejection. Our method tackles this challenge by addressing feature matching and pose optimization jointly. To this end, we propose a graph attention network that predicts image correspondences along with confidence weights. The resulting matches serve as weighted constraints in a differentiable pose estimation. Training feature matching with gradients from pose optimization naturally learns to down-weight outliers and improves pose estimation on image pairs by 6.7% over SuperGlue on ScanNet. At the same time, it reduces the pose estimation time by over 50% and renders RANSAC iterations unnecessary. Moreover, we integrate information from multiple views by spanning the graph across multiple frames to predict all matches at once. Multi-view matching combined with end-to-end training improves the pose estimation metrics on Matterport3D by 18.5% compared to SuperGlue.
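
To illustrate how confidence weights can turn matches into weighted constraints, the following is a minimal toy sketch, not the pose optimization used in the paper. It assumes a simplified 2D setting in which a rotation and translation are fitted to point matches by closed-form weighted least squares; the function name `weighted_rigid_fit_2d`, the sample points, and the weights are all hypothetical. A match with a near-zero weight barely influences the estimate, which is the effect that makes a separate outlier-rejection step less necessary.

```python
import math


def weighted_rigid_fit_2d(src, dst, weights):
    """Fit a 2D rotation angle and translation by weighted least squares.

    Minimizes sum_i w_i * ||R(theta) * src_i + t - dst_i||^2 in closed form.
    Toy analogue only: the 2D setting and this interface are assumptions.
    """
    total = sum(weights)
    # Weighted centroids of the source and destination points.
    cxs = sum(w * p[0] for w, p in zip(weights, src)) / total
    cys = sum(w * p[1] for w, p in zip(weights, src)) / total
    cxd = sum(w * q[0] for w, q in zip(weights, dst)) / total
    cyd = sum(w * q[1] for w, q in zip(weights, dst)) / total

    # Weighted dot and cross products of the centered point pairs.
    dot = 0.0
    cross = 0.0
    for w, (xs, ys), (xd, yd) in zip(weights, src, dst):
        ax, ay = xs - cxs, ys - cys
        bx, by = xd - cxd, yd - cyd
        dot += w * (ax * bx + ay * by)
        cross += w * (ax * by - ay * bx)

    # The optimal angle maximizes cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The optimal translation maps the rotated source centroid onto the
    # destination centroid.
    tx = cxd - (c * cxs - s * cys)
    ty = cyd - (s * cxs + c * cys)
    return theta, (tx, ty)


# Hypothetical data: four correct matches and one gross outlier.
TRUE_THETA = 0.3
TRUE_T = (1.0, -2.0)
src_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
cos_t, sin_t = math.cos(TRUE_THETA), math.sin(TRUE_THETA)
dst_pts = [
    (cos_t * x - sin_t * y + TRUE_T[0], sin_t * x + cos_t * y + TRUE_T[1])
    for x, y in src_pts
]
# Corrupt the last match to simulate an erroneous correspondence.
dst_pts[-1] = (dst_pts[-1][0] + 3.0, dst_pts[-1][1] - 3.0)

# Uniform weights treat the outlier like any other match.
theta_uniform, t_uniform = weighted_rigid_fit_2d(src_pts, dst_pts, [1.0] * 5)
# Assumed confidences that down-weight the outlier.
theta_weighted, t_weighted = weighted_rigid_fit_2d(
    src_pts, dst_pts, [1.0, 1.0, 1.0, 1.0, 1e-3]
)

print("angle error, uniform weights: ", abs(theta_uniform - TRUE_THETA))
print("angle error, confidence weights:", abs(theta_weighted - TRUE_THETA))
```

Because every step of the fit is a smooth function of the weights, a framework with automatic differentiation could in principle pass gradients from the pose error back to whatever produced those weights, which is the general idea behind training a matcher end to end.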

Publication:

arXiv e-prints

Pub Date:

May 2022

DOI:

10.48550/arXiv.2205.01694

arXiv:

arXiv:2205.01694

Bibcode:

2022arXiv220501694R

Keywords:

Computer Science - Computer Vision and Pattern Recognition

E-Print:

ICCV 2023, project page: https://barbararoessle.github.io/e2e_multi_view_matching , video: https://youtu.be/uuLb6GfM9Cg
