DensSiam: End-to-End Densely-Siamese Network with Self-Attention Model for Object Tracking
Abstract
Convolutional Siamese neural networks have been recently used to track objects using deep features. Siamese architecture can achieve real time speed, however it is still difficult to find a Siamese architecture that maintains the generalization capability, high accuracy and speed while decreasing the number of shared parameters especially when it is very deep. Furthermore, a conventional Siamese architecture usually processes one local neighborhood at a time, which makes the appearance model local and non-robust to appearance changes. To overcome these two problems, this paper proposes DensSiam, a novel convolutional Siamese architecture, which uses the concept of dense layers and connects each dense layer to all layers in a feed-forward fashion with a similarity-learning function. DensSiam also includes a Self-Attention mechanism to force the network to pay more attention to the non-local features during offline training. Extensive experiments are performed on four tracking benchmarks: OTB2013 and OTB2015 for validation set; and VOT2015, VOT2016 and VOT2017 for testing set. The obtained results show that DensSiam achieves superior results on these benchmarks compared to other current state-of-the-art methods.
- Publication:
-
arXiv e-prints
- Pub Date:
- September 2018
- DOI:
- 10.48550/arXiv.1809.02714
- arXiv:
- arXiv:1809.02714
- Bibcode:
- 2018arXiv180902714A
- Keywords:
-
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- 11 pages, 3 figures, Accepted by ISVC18