Revisiting Self-Supervised Monocular Depth Estimation

doi:10.48550/arXiv.2103.12496

Revisiting Self-Supervised Monocular Depth Estimation

Self-supervised learning of depth map prediction and motion estimation from monocular video sequences is of vital importance -- since it realizes a broad range of tasks in robotics and autonomous vehicles. A large number of research efforts have enhanced the performance by tackling illumination variation, occlusions, and dynamic objects, to name a few. However, each of those efforts targets individual goals and endures as separate works. Moreover, most of previous works have adopted the same CNN architecture, not reaping architectural benefits. Therefore, the need to investigate the inter-dependency of the previous methods and the effect of architectural factors remains. To achieve these objectives, we revisit numerous previously proposed self-supervised methods for joint learning of depth and motion, perform a comprehensive empirical study, and unveil multiple crucial insights. Furthermore, we remarkably enhance the performance as a result of our study -- outperforming previous state-of-the-art performance.

Publication:

arXiv e-prints

Pub Date:

March 2021

DOI:

10.48550/arXiv.2103.12496

arXiv:

arXiv:2103.12496

Bibcode:

2021arXiv210312496K

Keywords:

Computer Science - Computer Vision and Pattern Recognition;
Computer Science - Artificial Intelligence;
Computer Science - Robotics

E-Print:

14 pages, 3 figures, 4 tables

NASA/ADS

Revisiting Self-Supervised Monocular Depth Estimation

Abstract