Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches
Abstract
We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.
- Publication:
-
arXiv e-prints
- Pub Date:
- October 2015
- DOI:
- 10.48550/arXiv.1510.05970
- arXiv:
- arXiv:1510.05970
- Bibcode:
- 2015arXiv151005970Z
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition;
- Computer Science - Machine Learning;
- Computer Science - Neural and Evolutionary Computing
- E-Print:
- JMLR 17(65):1-32, 2016