Robust Maximum Entropy Behavior Cloning
Abstract
Imitation learning (IL) algorithms use expert demonstrations to learn a specific task. Most existing approaches assume that all expert demonstrations are reliable and trustworthy, but what if adversarial demonstrations exist in the given dataset? This can result in poor decision-making performance. We propose a novel general framework that directly generates a policy from demonstrations, autonomously detecting adversarial demonstrations and excluding them from the dataset. At the same time, it is sample- and time-efficient and does not require a simulator. To model such adversarial demonstrations, we propose a min-max problem that leverages the entropy of the model to assign a weight to each demonstration. This allows us to learn the behavior using only the correct demonstrations, or a mixture of correct and adversarial demonstrations.
- Publication: arXiv e-prints
- Pub Date: January 2021
- DOI: 10.48550/arXiv.2101.01251
- arXiv: arXiv:2101.01251
- Bibcode: 2021arXiv210101251H
- Keywords: Computer Science - Machine Learning
- E-Print: NeurIPS 2020 3rd Robot Learning Workshop: Grounding Machine Learning Development in the Real World
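The entropy-weighted reweighting idea described in the abstract can be sketched as follows. This is a minimal numpy illustration, not the paper's exact formulation: the toy data, the tabular softmax policy, and the hyperparameters `lr` and `beta` are all assumptions. The weight update `w ∝ exp(-beta * loss)` is the closed-form minimizer of an entropy-regularized inner objective (weighted loss plus `(1/beta)` times the negative entropy of the weights), so high-loss, suspicious demonstrations are automatically down-weighted while clean demonstrations dominate the policy update.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Toy demonstrations over 2 states and 2 actions.
# In state 0, the clean expert takes action 0; one demo (index 3) is adversarial.
states = np.array([0, 0, 0, 0, 1, 1])
actions = np.array([0, 0, 0, 1, 0, 1])
n_states, n_actions = 2, 2

theta = np.zeros((n_states, n_actions))      # logits of a tabular softmax policy
w = np.full(len(states), 1.0 / len(states))  # per-demonstration weights

lr, beta = 0.5, 5.0  # beta controls how aggressively high-loss demos are discounted
for _ in range(200):
    probs = softmax(theta)                   # current policy pi(a|s)
    nll = -np.log(probs[states, actions])    # per-demo negative log-likelihood
    # Inner step: entropy-regularized reweighting (closed form, soft-min over losses).
    w = np.exp(-beta * nll)
    w /= w.sum()
    # Outer step: weighted maximum-likelihood gradient on the logits.
    grad = np.zeros_like(theta)
    for i, (s, a) in enumerate(zip(states, actions)):
        g = probs[s].copy()
        g[a] -= 1.0                          # d/dtheta[s] of -log pi(a|s)
        grad[s] += w[i] * g
    theta -= lr * grad
```

After training, the policy in state 0 concentrates on the clean action, and the adversarial demonstration's weight collapses toward zero, which is the qualitative behavior the abstract claims for its min-max formulation.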