Contact-conditioned learning of locomotion policies

doi:10.48550/arXiv.2408.00776

Locomotion is realized through making and breaking contact. State-of-the-art constrained nonlinear model predictive controllers (NMPC) generate whole-body trajectories for a given contact sequence. However, these approaches are computationally expensive at run time. Hence, it is desirable to offload some of this computation to an offline phase. In this paper, we hypothesize that conditioning a learned policy on the locations and timings of contact is a suitable representation for learning a single policy that can generate multiple gaits (contact sequences). In this way, we can build a single generalist policy to realize different gaited and non-gaited locomotion skills and the transitions among them. Our extensive simulation results demonstrate the validity of our hypothesis for learning multiple gaits for a biped robot.
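
To make the idea of conditioning on contact locations and timings concrete, the following is a minimal sketch of how such a conditioning vector might be assembled and appended to a policy's input. The abstract does not specify the actual representation, so everything here is an assumption made for illustration: the `Contact` structure, the per-foot feature layout (contact flag, time to the next contact switch, target location), and the function names `contact_conditioning` and `policy_input` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Contact:
    """One planned contact for a single foot (hypothetical structure)."""
    foot: int                             # 0 = left, 1 = right (assumed convention)
    position: Tuple[float, float, float]  # desired contact location
    start: float                          # time the contact is made [s]
    end: float                            # time the contact is broken [s]


def contact_conditioning(plan: Sequence[Contact], t: float, n_feet: int = 2) -> List[float]:
    """Per foot, return [in_contact, time_to_switch, x, y, z] at time t.

    The feature layout is an illustrative assumption, not the paper's.
    """
    features: List[float] = []
    for foot in range(n_feet):
        # Contacts of this foot that have not ended yet, earliest first.
        upcoming = sorted(
            (c for c in plan if c.foot == foot and c.end > t),
            key=lambda c: c.start,
        )
        if not upcoming:
            # No contact left in the plan for this foot: pad with zeros.
            features += [0.0] * 5
            continue
        c = upcoming[0]
        in_contact = 1.0 if c.start <= t else 0.0
        # Time until the contact breaks (stance) or is made (swing).
        time_to_switch = (c.end - t) if in_contact else (c.start - t)
        features += [in_contact, time_to_switch, *c.position]
    return features


def policy_input(robot_state: Sequence[float], plan: Sequence[Contact], t: float) -> List[float]:
    """Concatenate the robot state with the contact conditioning vector."""
    return list(robot_state) + contact_conditioning(plan, t)


# Usage: a two-contact plan queried while the left foot is in stance.
plan = [
    Contact(foot=0, position=(0.0, 0.1, 0.0), start=0.0, end=0.6),
    Contact(foot=1, position=(0.2, -0.1, 0.0), start=0.4, end=1.0),
]
obs = policy_input(robot_state=[0.0, 0.0, 0.0], plan=plan, t=0.2)
```

Under this assumed layout, changing the gait only changes the contact plan passed in, while the policy's input dimension stays fixed, which is one way a single policy could be asked for different contact sequences.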

Publication:

arXiv e-prints

Pub Date:

July 2024

DOI:

10.48550/arXiv.2408.00776

arXiv:

arXiv:2408.00776

Bibcode:

2024arXiv240800776C

Keywords:

Computer Science - Robotics