Instance-incremental Scene Graph Generation from Real-world Point Clouds via Normalizing Flows
Abstract
This work introduces a new task, instance-incremental scene graph generation: given a point cloud of a scene, represent it as a graph and automatically add novel instances to it. The final output is a graph denoting the object layout of the scene. The task is important because it helps guide the insertion of novel 3D objects into a real-world scene in vision-based applications such as augmented reality. It is also challenging because the complexity of real-world point clouds makes it difficult to learn object layout experiences from the observation data (non-empty rooms with labeled semantics). We model this task as a conditional generation problem and propose a 3D autoregressive framework based on normalizing flows (3D-ANF) to address it. First, we represent the point cloud as a graph by extracting the label semantics and contextual relationships. Next, a model based on normalizing flows is introduced to map the conditional generation of graph elements into the Gaussian process. Because the mapping is invertible, the real-world experiences represented in the observation data can be modeled in the training phase, and novel instances can be autoregressively generated from the Gaussian process in the testing phase. To evaluate our method thoroughly, we implement this new task on the indoor benchmark dataset 3DSSG-O27R16 and on GPL3D, our newly proposed graphical dataset of outdoor scenes. Experiments show that our method generates reliable novel graphs from real-world point clouds and achieves state-of-the-art performance on both datasets.
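The invertible mapping described in the abstract can be illustrated with a minimal sketch. This is not the authors' 3D-ANF: it is a toy conditional affine flow in NumPy, assuming each graph node is a small feature vector and that the context is simply the mean of the existing nodes. The class `ConditionalAffineFlow`, the function `generate_instances`, and the random linear conditioners are all hypothetical names and choices made for illustration.

```python
import numpy as np


class ConditionalAffineFlow:
    """Toy invertible map x <-> z whose shift and log-scale depend on a context vector."""

    def __init__(self, ctx_dim, x_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Hypothetical linear conditioners; a real model would learn these weights.
        self.W_mu = rng.normal(scale=0.1, size=(ctx_dim, x_dim))
        self.W_s = rng.normal(scale=0.1, size=(ctx_dim, x_dim))

    def forward(self, x, ctx):
        """Data -> latent. Returns z and log|det dz/dx|."""
        mu, s = ctx @ self.W_mu, ctx @ self.W_s
        z = (x - mu) * np.exp(-s)
        return z, -np.sum(s)

    def inverse(self, z, ctx):
        """Latent -> data, the exact inverse of forward for the same context."""
        mu, s = ctx @ self.W_mu, ctx @ self.W_s
        return z * np.exp(s) + mu

    def log_prob(self, x, ctx):
        """Log-likelihood of x under a standard Gaussian base (change of variables)."""
        z, log_det = self.forward(x, ctx)
        log_pz = -0.5 * np.sum(z ** 2) - 0.5 * z.size * np.log(2 * np.pi)
        return log_pz + log_det


def generate_instances(flow, nodes, n_new, rng):
    """Autoregressively append n_new nodes, each conditioned on the nodes so far."""
    nodes = list(nodes)
    for _ in range(n_new):
        ctx = np.mean(nodes, axis=0)        # crude summary of the current graph (assumption)
        z = rng.standard_normal(ctx.shape)  # sample from the Gaussian base
        nodes.append(flow.inverse(z, ctx))  # map the sample back to node space
    return np.stack(nodes)
```

In training, `log_prob` would be maximized over observed nodes; at test time, `generate_instances` samples from the Gaussian base and inverts the map, so each new node depends on those already in the graph.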
- Publication: arXiv e-prints
- Pub Date: February 2023
- DOI: 10.48550/arXiv.2302.10425
- arXiv: arXiv:2302.10425
- Bibcode: 2023arXiv230210425Q
- Keywords: Computer Science - Computer Vision and Pattern Recognition
- E-Print: Accepted by IEEE TCSVT. The supplementary material is available in the media column of the journal version of the article.