TransFlow: Unsupervised Motion Flow by Joint Geometric and Pixel-level Estimation
Abstract
We address unsupervised optical flow estimation for ego-centric motion. We argue that optical flow can be cast as a geometrical warping between two successive video frames and devise a deep architecture to estimate this transformation in two stages. First, a dense pixel-level flow is computed with a geometric prior imposing strong spatial constraints. This prior is typical of driving scenes, where the point of view is coherent with the vehicle motion. We show how this global transformation can be approximated with a homography and how spatial transformer layers can be employed to compute the flow field implied by it. The second stage then refines the prediction by feeding it to a second, deeper network. A final reconstruction loss compares the warping of frame X(t) with the subsequent frame X(t+1) and guides both estimates. The model, which we name TransFlow, performs favorably compared to other unsupervised algorithms and shows better generalization than supervised methods, with a 3x reduction in error on unseen data.
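The abstract states that the global inter-frame motion can be approximated by a homography, whose action on the pixel grid yields a dense flow field. A minimal NumPy sketch of that relationship is below; the function name and array layout are illustrative assumptions, not the paper's implementation (which uses spatial transformer layers inside a network).

```python
import numpy as np

def flow_from_homography(H, height, width):
    """Dense flow field implied by a 3x3 homography H.

    Illustrative sketch: maps every pixel (x, y) through H with a
    perspective divide, then subtracts the original coordinates to
    obtain the (dx, dy) displacement at each pixel.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    ones = np.ones_like(xs)
    # Homogeneous pixel coordinates, shape (3, H*W)
    coords = np.stack([xs, ys, ones]).reshape(3, -1).astype(np.float64)
    warped = H @ coords
    warped = warped[:2] / warped[2:3]  # perspective divide
    # flow[0] = horizontal displacement, flow[1] = vertical displacement
    return (warped - coords[:2]).reshape(2, height, width)

# Example: a pure-translation homography shifts every pixel by (2, -1),
# so the implied flow field is constant.
H = np.array([[1.0, 0.0,  2.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0,  1.0]])
flow = flow_from_homography(H, 4, 5)
```

In the paper's setting this warp would be applied to frame X(t) and compared against X(t+1) by the reconstruction loss; the sketch only shows the homography-to-flow conversion itself.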
- Publication:
- arXiv e-prints
- Pub Date:
- June 2017
- DOI:
- 10.48550/arXiv.1706.00322
- arXiv:
- arXiv:1706.00322
- Bibcode:
- 2017arXiv170600322A
- Keywords:
- Computer Science - Computer Vision and Pattern Recognition
- E-Print:
- We have found a bug in the flow evaluation code that compromises the experimental evaluation, so the results provided in the paper are no longer correct. We are currently working on a new experimental campaign; we estimate that results will be available in a few weeks and will drastically change the paper, hence the withdrawal request.