Weakly Supervised Semantic Segmentation by Knowledge Graph Inference
Abstract
Currently, existing efforts in Weakly Supervised Semantic Segmentation (WSSS) based on Convolutional Neural Networks (CNNs) have predominantly focused on enhancing the multi-label classification network stage, with limited attention given to the equally important downstream segmentation network. Furthermore, CNN-based local convolutions lack the ability to model the extensive inter-category dependencies. Therefore, this paper introduces a graph reasoning-based approach to enhance WSSS. The aim is to improve WSSS holistically by simultaneously enhancing both the multi-label classification and segmentation network stages. In the multi-label classification network segment, external knowledge is integrated, coupled with GCNs, to globally reason about inter-class dependencies. This encourages the network to uncover features in non-salient regions of images, thereby refining the completeness of generated pseudo-labels. In the segmentation network segment, the proposed Graph Reasoning Mapping (GRM) module is employed to leverage knowledge obtained from textual databases, facilitating contextual reasoning for class representation within image regions. This GRM module enhances feature representation in high-level semantics of the segmentation network's local convolutions, while dynamically learning semantic coherence for individual samples. Using solely image-level supervision, we have achieved state-of-the-art performance in WSSS on the PASCAL VOC 2012 and MS-COCO datasets. Extensive experimentation on both the multi-label classification and segmentation network stages underscores the effectiveness of the proposed graph reasoning approach for advancing WSSS.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2023
- DOI:
- 10.48550/arXiv.2309.14057
- arXiv:
- arXiv:2309.14057
- Bibcode:
- 2023arXiv230914057Z
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- Our description in Chapter 3, Section 3.2 of the paper is too repetitive with the paper "Object detection meets knowledge graphs". There is an error in the description of formula (5) in Section 3.3. And a detailed reasoning process is required for formula (5). Therefore, we wish to request a retraction of the paper