Synthetic-to-Real Domain Adaptation using Contrastive Unpaired Translation
Abstract
The usefulness of deep learning models in robotics depends largely on the availability of training data. Manual annotation of training data is often infeasible. Synthetic data is a viable alternative, but suffers from a domain gap. We propose a multi-step method to obtain training data without manual annotation effort: from 3D object meshes, we generate images using a modern synthesis pipeline. We then use a state-of-the-art image-to-image translation method to adapt the synthetic images to the real domain, minimizing the domain gap in a learned manner. The translation network is trained on unpaired images, i.e., it requires only an unannotated collection of real images. The generated and refined images can then be used to train deep learning models for a particular task. We also propose and evaluate extensions to the translation method that further increase performance, such as patch-based training, which shortens training time and increases global consistency. We evaluate our method and demonstrate its effectiveness on two robotic datasets. Finally, we give insight into the learned refinement operations.
- Publication: arXiv e-prints
- Pub Date: March 2022
- DOI: 10.48550/arXiv.2203.09454
- arXiv: arXiv:2203.09454
- Bibcode: 2022arXiv220309454I
- Keywords: Computer Science - Computer Vision and Pattern Recognition
- E-Print: 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE), 2022, pp. 595-602