Segment to Recognize Robustly -- Enhancing Recognition by Image Decomposition
Abstract
In image recognition, both foreground (FG) and background (BG) play an important role; however, standard deep image recognition often leads to unintended over-reliance on the BG, limiting model robustness in real-world deployment settings. Current solutions mainly suppress the BG, sacrificing BG information for improved generalization. We propose "Segment to Recognize Robustly" (S2R^2), a novel recognition approach which decouples the FG and BG modelling and combines them in a simple, robust, and interpretable manner. S2R^2 leverages recent advances in zero-shot segmentation to isolate the FG and the BG before or during recognition. By combining FG and BG, potentially also with a standard full-image classifier, S2R^2 achieves state-of-the-art results on in-domain data while maintaining robustness to BG shifts. The results confirm that segmentation before recognition is now possible.
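The decomposition described in the abstract can be illustrated with a minimal sketch. It assumes a binary FG mask is already available (in practice it would come from a zero-shot segmenter) and that two separate classifiers have produced logits for the FG-only and BG-only images. The function names, the zero fill value, the convex-combination fusion rule, and the weight of 0.7 are all hypothetical choices for illustration, not the method from the paper.

```python
import math


def decompose(image, mask, fill=0.0):
    """Split an image into FG-only and BG-only copies using a binary mask (1 = FG)."""
    fg = [[px if m else fill for px, m in zip(row, mrow)]
          for row, mrow in zip(image, mask)]
    bg = [[fill if m else px for px, m in zip(row, mrow)]
          for row, mrow in zip(image, mask)]
    return fg, bg


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(fg_logits, bg_logits, fg_weight=0.7):
    """Hypothetical fusion: convex combination of FG and BG class probabilities."""
    fg_p, bg_p = softmax(fg_logits), softmax(bg_logits)
    return [fg_weight * f + (1.0 - fg_weight) * b for f, b in zip(fg_p, bg_p)]


# Toy 2x2 grayscale image and an assumed segmentation mask.
image = [[0.2, 0.9], [0.8, 0.1]]
mask = [[0, 1], [1, 0]]
fg, bg = decompose(image, mask)

# Stand-in logits; real ones would come from FG and BG classifiers.
probs = fuse([2.0, 0.5, 0.1], [0.3, 1.5, 0.2])
pred = max(range(len(probs)), key=probs.__getitem__)
```

Because the FG and BG contributions stay separate until the final weighted sum, each stream's vote for a class can be inspected on its own, which is one way to read the "interpretable" combination the abstract mentions.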
- Publication: arXiv e-prints
- Pub Date: November 2024
- arXiv: arXiv:2411.15933
- Bibcode: 2024arXiv241115933J
- Keywords: Computer Science - Computer Vision and Pattern Recognition