Large-Scale Classification of Structured Objects using a CRF with Deep Class Embedding

doi:10.48550/arXiv.1705.07420

This paper presents a novel deep learning architecture for classifying structured objects in datasets with a large number of visually similar categories. We model sequences of images as linear-chain conditional random fields (CRFs) and jointly learn the parameters from both local visual features and neighboring classes. The visual features are computed by convolutional layers, and the class embeddings are learned by factorizing the CRF pairwise potential matrix. The result is a highly nonlinear objective function, which is trained by optimizing a local likelihood approximation with batch normalization. The model overcomes the difficulty that existing CRF methods have in learning contextual relationships thoroughly when the number of classes is large and the data is sparse. The proposed method is evaluated on a large dataset of retail-store product display images taken in varying settings and viewpoints, where it shows significantly improved results compared to linear CRF modeling and unnormalized likelihood optimization.
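
To make the factorization idea concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. It assumes the pairwise potential between a class and its right-hand neighbor is the inner product of two low-dimensional class embeddings, and it uses the standard pseudo-likelihood (each label conditioned on its two neighbors) as a stand-in for the "local likelihood approximation" named in the abstract. The function names, the toy scores, and the embedding values are all hypothetical, and the convolutional layers and batch normalization are omitted.

```python
import math


def pairwise_from_embeddings(left, right):
    """Low-rank pairwise potential: P[a][b] = <left[a], right[b]>.

    `left` and `right` are hypothetical K x d class-embedding tables,
    so the K x K matrix P needs only 2*K*d parameters instead of K*K.
    """
    return [[sum(l * r for l, r in zip(la, rb)) for rb in right] for la in left]


def local_log_likelihood(unary, pairwise, labels, i):
    """log p(y_i | y_{i-1}, y_{i+1}, x_i) under a pseudo-likelihood assumption.

    `unary[i][c]` stands in for the visual score of class c at position i
    (in the paper this would come from convolutional layers).
    """
    scores = []
    for c in range(len(unary[i])):
        s = unary[i][c]
        if i > 0:
            s += pairwise[labels[i - 1]][c]   # left neighbor -> candidate c
        if i + 1 < len(labels):
            s += pairwise[c][labels[i + 1]]   # candidate c -> right neighbor
        scores.append(s)
    # Numerically stable log-sum-exp over the candidate classes.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[labels[i]] - log_z


# Toy example: 3 positions, 3 classes, embedding dimension 2 (all made up).
unary = [[2.0, 0.0, 0.0], [0.5, 0.4, 0.0], [0.0, 0.0, 2.0]]
left = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
right = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
pairwise = pairwise_from_embeddings(left, right)

# Middle position: the visual score alone slightly prefers class 0,
# but the neighboring classes favor class 1.
with_context_1 = local_log_likelihood(unary, pairwise, [0, 1, 2], 1)
with_context_0 = local_log_likelihood(unary, pairwise, [0, 0, 2], 1)
print(with_context_1 > with_context_0)
```

In this toy setup the neighbor terms outweigh the small visual preference for class 0 at the middle position, which is the kind of contextual correction a CRF is meant to provide. With many classes, the embedding tables grow linearly in the number of classes, whereas a full pairwise matrix grows quadratically.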

Publication: arXiv e-prints

Pub Date: May 2017

DOI: 10.48550/arXiv.1705.07420

arXiv: arXiv:1705.07420

Bibcode: 2017arXiv170507420G

Keywords: Computer Science - Computer Vision and Pattern Recognition

E-Print: Computer Vision and Image Understanding (2019) 102865
