Self-Improving SLAM in Dynamic Environments: Learning When to Mask

doi:10.48550/arXiv.2210.08350

Visual SLAM (Simultaneous Localization and Mapping) in dynamic environments typically relies on identifying and masking image features on moving objects so that they do not degrade performance. Current approaches are suboptimal: they either fail to mask objects when masking is needed or mask objects needlessly. We therefore propose a novel SLAM that learns when masking objects improves its performance in dynamic scenarios. Given a method to segment objects and a SLAM, we give the latter the ability of Temporal Masking, i.e., to infer when certain classes of objects should be masked to maximize any given SLAM metric. We make no prior assumptions about motion: our method learns to mask moving objects by itself. To avoid high annotation costs, we created an automatic annotation method for self-supervised training. We constructed a new dataset, named ConsInv, which includes challenging real-world dynamic sequences both indoors and outdoors. Our method matches the state of the art on the TUM RGB-D dataset and outperforms it on the KITTI and ConsInv datasets.
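As a rough illustration of how automatic annotation for "when to mask" could work, the sketch below labels each time window by comparing a SLAM error metric obtained with and without masking. It is not the authors' procedure, which the abstract does not detail: the function name `label_windows`, the per-window error inputs, and the `margin` parameter are all hypothetical, and the sketch assumes a lower-is-better metric (for example, a trajectory error) computed over aligned windows of one sequence.

```python
from typing import List, Sequence


def label_windows(
    err_masked: Sequence[float],
    err_unmasked: Sequence[float],
    margin: float = 0.0,
) -> List[int]:
    """Return one label per window: 1 = mask, 0 = do not mask.

    Hypothetical helper, not taken from the paper. A window is labeled 1
    only when masking lowered the error by more than `margin`.
    """
    if len(err_masked) != len(err_unmasked):
        raise ValueError("both error sequences must cover the same windows")
    labels = []
    for masked, unmasked in zip(err_masked, err_unmasked):
        # Positive improvement means the masked run had the lower error.
        improvement = unmasked - masked
        labels.append(1 if improvement > margin else 0)
    return labels


# Window 0: masking helps. Window 1: masking hurts. Window 2: no difference.
labels = label_windows([0.1, 0.5, 0.2], [0.3, 0.4, 0.2])
print(labels)  # [1, 0, 0]
```

Labels produced this way need no human annotation, which is the property the abstract attributes to its self-supervised training; a non-zero `margin` would be one way to avoid labeling windows where the difference is within noise.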

Publication:

arXiv e-prints

Pub Date:

October 2022

DOI:

10.48550/arXiv.2210.08350

arXiv:

arXiv:2210.08350

Bibcode:

2022arXiv221008350B

Keywords:

Computer Science - Computer Vision and Pattern Recognition;
Computer Science - Artificial Intelligence

E-Print:

Accepted to BMVC 2022. Dataset link: https://github.com/adrianbojko/consinv-dataset
