Contact-conditioned learning of locomotion policies
Abstract
Locomotion is realized through making and breaking contact. State-of-the-art constrained nonlinear model predictive controllers (NMPC) generate whole-body trajectories for a given contact sequence. However, these approaches are computationally expensive at run-time. Hence it is desirable to offload some of this computation to an offline phase. In this paper, we hypothesize that conditioning a learned policy on the locations and timings of contact is a suitable representation for learning a single policy that can generate multiple gaits (contact sequences). In this way, we can build a single generalist policy to realize different gaited and non-gaited locomotion skills and the transitions among them. Our extensive simulation results demonstrate the validity of our hypothesis for learning multiple gaits for a biped robot.
- Publication:
-
arXiv e-prints
- Pub Date:
- July 2024
- DOI:
- 10.48550/arXiv.2408.00776
- arXiv:
- arXiv:2408.00776
- Bibcode:
- 2024arXiv240800776C
- Keywords:
-
- Computer Science - Robotics