Depth-Adapted CNN for RGB-D cameras

doi:10.48550/arXiv.2009.09976

Conventional 2D convolutional neural networks (CNNs) extract features from an input image by applying linear filters. These filters compute spatial coherence by weighting the photometric information over a fixed neighborhood, without taking geometric information into account. We tackle the problem of improving classical RGB CNN methods by using the depth information provided by RGB-D cameras. State-of-the-art approaches use depth as an additional channel or image (HHA), or move from 2D CNNs to 3D CNNs. This paper proposes a novel and generic procedure to articulate both photometric and geometric information in a CNN architecture. The depth data is represented as a 2D offset that adapts the spatial sampling locations. The new model is invariant to scale and to rotation around the X and Y axes of the camera coordinate system. Moreover, when the depth data is constant, our model is equivalent to a regular CNN. Experiments on benchmarks validate the effectiveness of our model.
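The abstract does not spell out how the 2D offsets are computed, so the following is only a minimal sketch of one plausible reading, not the authors' formulation. It assumes a pinhole camera (the focal length `f` and principal point `cx`, `cy` are illustrative values), estimates the local surface plane from forward differences of the depth map, lays a 3x3 grid on that plane, and projects it back into the image. The function names `backproject` and `depth_adapted_offsets` are hypothetical. Under these assumptions the sketch reproduces one property stated in the abstract: with constant depth every offset is zero, so sampling falls back to the regular grid.

```python
import numpy as np


def backproject(depth, u, v, f, cx, cy):
    """Lift pixel (u, v) to a 3D camera-frame point (assumed pinhole model)."""
    z = depth[v, u]
    return np.array([(u - cx) * z / f, (v - cy) * z / f, z])


def depth_adapted_offsets(depth, u, v, f=500.0, cx=None, cy=None, k=3):
    """Return a (k, k, 2) array of (du, dv) offsets for the kernel at (u, v).

    Illustrative assumption: the kernel grid lies on the local tangent plane
    of the surface, with spacing chosen so that a fronto-parallel plane maps
    back onto the regular pixel grid. The pixel must not lie on the right or
    bottom image border, and the surface must not be seen edge-on.
    """
    h, w = depth.shape
    cx = (w - 1) / 2 if cx is None else cx
    cy = (h - 1) / 2 if cy is None else cy

    # 3D point at the kernel center and two tangent vectors from neighbors.
    p0 = backproject(depth, u, v, f, cx, cy)
    tx = backproject(depth, u + 1, v, f, cx, cy) - p0
    ty = backproject(depth, u, v + 1, f, cx, cy) - p0

    # Unit normal of the local plane.
    n = np.cross(tx, ty)
    n /= np.linalg.norm(n)

    # Orthonormal in-plane basis: camera X axis projected onto the plane.
    ex = np.array([1.0, 0.0, 0.0])
    ex = ex - ex.dot(n) * n
    ex /= np.linalg.norm(ex)
    ey = np.cross(n, ex)

    # Metric size of one pixel at the center depth.
    s = p0[2] / f

    r = k // 2
    offsets = np.zeros((k, k, 2))
    for j in range(-r, r + 1):
        for i in range(-r, r + 1):
            # Grid point on the tangent plane, then projection to the image.
            p = p0 + s * (i * ex + j * ey)
            u_proj = f * p[0] / p[2] + cx
            v_proj = f * p[1] / p[2] + cy
            # Offset relative to the regular sampling location (u + i, v + j).
            offsets[j + r, i + r] = (u_proj - (u + i), v_proj - (v + j))
    return offsets


# Constant depth: the adapted grid coincides with the regular grid.
flat = np.full((16, 16), 2.0)
flat_offsets = depth_adapted_offsets(flat, u=8, v=8)

# Depth increasing along u: the grid is deformed, so offsets are nonzero.
tilted = 2.0 + 0.01 * np.tile(np.arange(16, dtype=float), (16, 1))
tilted_offsets = depth_adapted_offsets(tilted, u=8, v=8)
```

In a full layer, such offsets would shift the locations at which the input feature map is sampled (for example by bilinear interpolation) before the usual filter weights are applied; that step is omitted here.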

Publication: arXiv e-prints

Pub Date: September 2020

DOI: 10.48550/arXiv.2009.09976

arXiv: arXiv:2009.09976

Bibcode: 2020arXiv200909976W

Keywords: Computer Science - Computer Vision and Pattern Recognition

E-Print: Accepted manuscript in ACCV 2020 (Oral)
