DDCNet: Deep Dilated Convolutional Neural Network for Dense Prediction
Abstract
Dense pixel matching problems such as optical flow and disparity estimation are among the most challenging tasks in computer vision. Recently, several deep learning methods designed for these problems have been successful. A sufficiently larger effective receptive field (ERF) and a higher resolution of spatial features within a network are essential for providing higher-resolution dense estimates. In this work, we present a systemic approach to design network architectures that can provide a larger receptive field while maintaining a higher spatial feature resolution. To achieve a larger ERF, we utilized dilated convolutional layers. By aggressively increasing dilation rates in the deeper layers, we were able to achieve a sufficiently larger ERF with a significantly fewer number of trainable parameters. We used optical flow estimation problem as the primary benchmark to illustrate our network design strategy. The benchmark results (Sintel, KITTI, and Middlebury) indicate that our compact networks can achieve comparable performance in the class of lightweight networks.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2021
- DOI:
- 10.48550/arXiv.2107.04715
- arXiv:
- arXiv:2107.04715
- Bibcode:
- 2021arXiv210704715S
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- 32 pages, 13 figures, 3 tables