AIMS: All-Inclusive Multi-Level Segmentation

doi:10.48550/arXiv.2305.17768

Despite progress in image segmentation toward accurate visual entity segmentation, meeting the diverse requirements of image editing applications for region-of-interest selection at different levels remains unsolved. In this paper, we propose a new task, All-Inclusive Multi-Level Segmentation (AIMS), which segments visual regions into three levels: part, entity, and relation (two entities with some semantic relationship). We also build a unified AIMS model through multi-dataset multi-task training to address the two major challenges of annotation inconsistency and task correlation. Specifically, we propose task complementarity, association, and a prompt mask encoder for three-level predictions. Extensive experiments demonstrate the effectiveness and generalization capacity of our method compared with other state-of-the-art methods on a single dataset and with the concurrent work on segmenting anything. We will make our code and trained model publicly available.

Publication: arXiv e-prints

Pub Date: May 2023

DOI: 10.48550/arXiv.2305.17768

arXiv: arXiv:2305.17768

Bibcode: 2023arXiv230517768Q

Keywords: Computer Science - Computer Vision and Pattern Recognition

E-Print: Technical Report
