Deep-Learning Assisted High-Resolution Binocular Stereo Depth Reconstruction
Abstract
This work presents dense stereo reconstruction using high-resolution images for infrastructure inspections. The state-of-the-art stereo reconstruction methods, both learning and non-learning ones, consume too much computational resource on high-resolution data. Recent learning-based methods achieve top ranks on most benchmarks. However, they suffer from the generalization issue due to lack of task-specific training data. We propose to use a less resource demanding non-learning method, guided by a learning-based model, to handle high-resolution images and achieve accurate stereo reconstruction. The deep-learning model produces an initial disparity prediction with uncertainty for each pixel of the down-sampled stereo image pair. The uncertainty serves as a self-measurement of its generalization ability and the per-pixel searching range around the initially predicted disparity. The downstream process performs a modified version of the Semi-Global Block Matching method with the up-sampled per-pixel searching range. The proposed deep-learning assisted method is evaluated on the Middlebury dataset and high-resolution stereo images collected by our customized binocular stereo camera. The combination of learning and non-learning methods achieves better performance on 12 out of 15 cases of the Middlebury dataset. In our infrastructure inspection experiments, the average 3D reconstruction error is less than 0.004m.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2019
- DOI:
- 10.48550/arXiv.1912.05012
- arXiv:
- arXiv:1912.05012
- Bibcode:
- 2019arXiv191205012H
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Robotics
- E-Print:
- Submitted to International Conference on Robotics and Automation (ICRA2020)