Pixels to Graphs by Associative Embedding

doi:10.48550/arXiv.1706.07365

Graphs are a useful abstraction of image content. Not only can graphs represent details about individual objects in a scene but they can capture the interactions between pairs of objects. We present a method for training a convolutional neural network such that it takes in an input image and produces a full graph definition. This is done end-to-end in a single stage with the use of associative embeddings. The network learns to simultaneously identify all of the elements that make up a graph and piece them together. We benchmark on the Visual Genome dataset, and demonstrate state-of-the-art performance on the challenging task of scene graph generation.
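
The abstract does not spell out how associative embeddings piece the graph together. As an illustrative sketch only, assume that each detected object carries an embedding vector, and that each detected relationship predicts the embeddings of its source and target objects. An edge can then be attached to the vertices whose embeddings lie closest to its predictions. The function names, the Euclidean distance, and the toy values below are assumptions made for illustration; they are not taken from the paper or from the px2graph code.

```python
import math


def nearest_vertex(embedding, vertex_embeddings):
    """Return the index of the vertex embedding closest to `embedding`.

    Assumption: closeness is measured with Euclidean distance.
    """
    return min(
        range(len(vertex_embeddings)),
        key=lambda i: math.dist(embedding, vertex_embeddings[i]),
    )


def assemble_graph(vertices, edges):
    """Turn detections into (subject, predicate, object) triples.

    vertices: list of (label, embedding)
    edges:    list of (label, source_embedding, target_embedding)
    """
    vertex_embeddings = [emb for _, emb in vertices]
    triples = []
    for label, src_emb, dst_emb in edges:
        # Resolve each end of the edge to the best-matching vertex.
        src = nearest_vertex(src_emb, vertex_embeddings)
        dst = nearest_vertex(dst_emb, vertex_embeddings)
        triples.append((vertices[src][0], label, vertices[dst][0]))
    return triples


# Hypothetical detections with 2-D embeddings (toy values, not real outputs).
vertices = [("person", (0.1, 0.9)), ("horse", (0.8, 0.2))]
edges = [("riding", (0.15, 0.85), (0.75, 0.25))]
triples = assemble_graph(vertices, edges)
print(triples)
```

In this toy setup the edge's predicted source embedding falls nearest to "person" and its predicted target embedding nearest to "horse", so the only triple produced links those two vertices through "riding".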

Publication: arXiv e-prints

Pub Date: June 2017

DOI: 10.48550/arXiv.1706.07365

arXiv: arXiv:1706.07365

Bibcode: 2017arXiv170607365N

Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Machine Learning

E-Print: Updated numbers. Code and pretrained models available at https://github.com/umich-vl/px2graph
