When SAM2 Meets Video Shadow and Mirror Detection
Abstract
As the successor to the Segment Anything Model (SAM), the Segment Anything Model 2 (SAM2) not only improves performance in image segmentation but also extends its capabilities to video segmentation. However, its effectiveness in segmenting rare objects that seldom appear in videos remains underexplored. In this study, we evaluate SAM2 on three distinct video segmentation tasks: Video Shadow Detection (VSD) and Video Mirror Detection (VMD). Specifically, we use ground truth point or mask prompts to initialize the first frame and then predict corresponding masks for subsequent frames. Experimental results show that SAM2's performance on these tasks is suboptimal, especially when point prompts are used, both quantitatively and qualitatively. Code is available at \url{https://github.com/LeipingJie/SAM2Video}
- Publication:
-
arXiv e-prints
- Pub Date:
- December 2024
- DOI:
- arXiv:
- arXiv:2412.19293
- Bibcode:
- 2024arXiv241219293J
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- Technical Report