Learning Correspondence from the Cycle-Consistency of Time
Abstract
We introduce a self-supervised method for learning visual correspondence from unlabeled video. The main idea is to use cycle-consistency in time as a free supervisory signal for learning visual representations from scratch. At training time, our model learns a feature map representation that is useful for performing cycle-consistent tracking. At test time, we use the acquired representation to find nearest neighbors across space and time. We demonstrate the generalizability of the representation -- without finetuning -- across a range of visual correspondence tasks, including video object segmentation, keypoint tracking, and optical flow. Our approach outperforms previous self-supervised methods and performs competitively with strongly supervised methods.
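The two ideas in the abstract, nearest-neighbor matching in feature space and a tracking cycle that should return to its starting point, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `nearest_neighbor_match` and `cycle_returns_to_start` are hypothetical, the features are assumed to be given as one vector per spatial location, and the hard argmax used here is a simplification chosen for illustration rather than the tracker used for training.

```python
import numpy as np


def nearest_neighbor_match(feat_a, feat_b):
    """For each row of feat_a (N, D), return the index of the most
    similar row of feat_b (M, D) under cosine similarity."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = a @ b.T  # cosine similarity matrix, shape (N, M)
    return sim.argmax(axis=1)


def cycle_returns_to_start(frames, start):
    """Track location `start` from the last frame backward to the first
    frame, then forward again, and report whether it lands on `start`.

    frames: list of (N, D) arrays, one per video frame (assumed layout).
    """
    idx = start
    # Backward pass: last frame -> first frame.
    for t in range(len(frames) - 1, 0, -1):
        idx = nearest_neighbor_match(frames[t][idx:idx + 1], frames[t - 1])[0]
    # Forward pass: first frame -> last frame.
    for t in range(len(frames) - 1):
        idx = nearest_neighbor_match(frames[t][idx:idx + 1], frames[t + 1])[0]
    return bool(idx == start)
```

In this toy setting, a cycle that fails to return to its start indicates that the features do not support consistent matching across the frames; the abstract's training signal is built on the same intuition, without requiring any labels.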
- Publication: arXiv e-prints
- Pub Date: March 2019
- DOI: 10.48550/arXiv.1903.07593
- arXiv: arXiv:1903.07593
- Bibcode: 2019arXiv190307593W
- Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence; Computer Science - Machine Learning
- E-Print: CVPR 2019 Oral. Project page: http://ajabri.github.io/timecycle