MFuseNet: Robust Depth Estimation with Learned Multiscopic Fusion
Abstract
We design a multiscopic vision system that utilizes a low-cost monocular RGB camera to acquire accurate depth estimation. Unlike multi-view stereo with images captured at unconstrained camera poses, the proposed system controls the motion of a camera to capture a sequence of images in horizontally or vertically aligned positions with the same parallax. In this system, we propose a new heuristic method and a robust learning-based method to fuse multiple cost volumes between the reference image and its surrounding images. To obtain training data, we build a synthetic dataset with multiscopic images. The experiments on the real-world Middlebury dataset and real robot demonstration show that our multiscopic vision system outperforms traditional two-frame stereo matching methods in depth estimation. Our code and dataset are available at https://sites.google.com/view/multiscopic.
- Publication:
-
arXiv e-prints
- Pub Date:
- August 2021
- DOI:
- 10.48550/arXiv.2108.02448
- arXiv:
- arXiv:2108.02448
- Bibcode:
- 2021arXiv210802448Y
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Artificial Intelligence;
- Computer Science - Robotics
- E-Print:
- IEEE International Conference on Robotics and Automation (ICRA) + IEEE Robotics and Automation Letters (RA-L). arXiv admin note: substantial text overlap with arXiv:2001.08212