Pixel-level Encoding and Depth Layering for Instance-level Semantic Labeling

Recent approaches for instance-aware semantic labeling have augmented convolutional neural networks (CNNs) with complex multi-task architectures or computationally expensive graphical models. We present a method that leverages a fully convolutional network (FCN) to predict semantic labels, depth and an instance-based encoding using each pixel's direction towards its corresponding instance center. Subsequently, we apply low-level computer vision techniques to generate state-of-the-art instance segmentation on the street scene datasets KITTI and Cityscapes. Our approach outperforms existing works by a large margin and can additionally predict absolute distances of individual instances from a monocular image as well as a pixel-level semantic labeling.
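To make the instance-based encoding concrete, the following is a minimal sketch of how a per-pixel "direction towards the instance center" target could be derived from a ground-truth instance map. It is an illustration under stated assumptions, not the authors' implementation: the function name `direction_to_center`, the use of the mask centroid as the center, the background ID of 0, and the quantization into `num_bins` angular classes are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def direction_to_center(instance_ids, num_bins=8):
    """Quantized direction from each pixel to the centroid of its instance.

    instance_ids: 2-D integer array; 0 is treated as background (assumption).
    num_bins: number of angular classes (assumption, not taken from the paper).
    Returns an integer array of the same shape; background pixels are -1.
    """
    h, w = instance_ids.shape
    rows, cols = np.mgrid[0:h, 0:w]
    bins = np.full((h, w), -1, dtype=np.int64)

    for inst in np.unique(instance_ids):
        if inst == 0:
            continue  # skip background
        mask = instance_ids == inst
        # Use the mask centroid as the instance center (assumed definition).
        cy = rows[mask].mean()
        cx = cols[mask].mean()
        # Angle of the vector pointing from each pixel to the center,
        # wrapped into [0, 2*pi).
        angle = np.arctan2(cy - rows[mask], cx - cols[mask]) % (2 * np.pi)
        # Map the angle to one of num_bins equal sectors.
        sector = np.floor(angle / (2 * np.pi) * num_bins).astype(np.int64)
        bins[mask] = sector % num_bins

    return bins


if __name__ == "__main__":
    # One background pixel followed by a three-pixel instance on a single row.
    ids = np.array([[0, 1, 1, 1]])
    print(direction_to_center(ids))
```

A network trained on such a target predicts, for every pixel, which way its instance center lies. Pixels of one object then point toward a common location, which is what allows simple post-processing to separate neighboring instances of the same semantic class.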

Publication: arXiv e-prints
Pub Date: April 2016
DOI: 10.48550/arXiv.1604.05096
arXiv: arXiv:1604.05096
Bibcode: 2016arXiv160405096U
Keywords: Computer Science - Computer Vision and Pattern Recognition
E-Print: Accepted at GCPR 2016. Includes supplementary material.
