Leveraging Bottom-Up and Top-Down Attention for Few-Shot Object Detection

doi:10.48550/arXiv.2007.12104

Few-shot object detection aims to detect objects from only a few annotated examples and remains a challenging, underexplored research problem. Recent studies have shown the effectiveness of self-learned top-down attention mechanisms in object detection and other vision tasks. Top-down attention, however, is less effective at improving the performance of few-shot detectors: because training data are insufficient, object detectors cannot effectively generate attention maps for few-shot examples. To improve the performance and interpretability of few-shot object detectors, we propose an attentive few-shot object detection network (AttFDNet) that takes advantage of both top-down and bottom-up attention. Being task-agnostic, the bottom-up attention serves as a prior that helps detect and localize naturally salient objects. We further address specific challenges in few-shot object detection by introducing two novel loss terms and a hybrid few-shot learning strategy. Experimental results and visualizations demonstrate the complementary nature of the two types of attention and their roles in few-shot object detection. Code is available at https://github.com/chenxy99/AttFDNet.

Publication:

arXiv e-prints

Pub Date:

July 2020

DOI:

10.48550/arXiv.2007.12104

arXiv:

arXiv:2007.12104

Bibcode:

2020arXiv200712104C

Keywords:

Computer Science - Computer Vision and Pattern Recognition

E-Print:

This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
