Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

doi:10.48550/arXiv.1904.01068

We propose a safe exploration algorithm for deterministic Markov Decision Processes with unknown transition models. Our algorithm guarantees safety by leveraging Lipschitz continuity to ensure that no unsafe states are visited during exploration. Unlike the guarantees offered by many existing techniques, ours is deterministic. The algorithm is also optimized to reduce the number of actions needed to explore the safe space. We compare its performance against baseline methods on simulated navigation tasks.
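
To illustrate how a Lipschitz bound can yield a deterministic safety certificate, here is a minimal sketch. It is not the algorithm from the paper. The abstract does not say which function is assumed Lipschitz, so the sketch assumes a scalar safety function h with a known Lipschitz constant, where a state counts as safe when h is at least a threshold. The name `certified_safe` and all of its parameters are hypothetical.

```python
import math


def certified_safe(known, candidates, lipschitz, threshold=0.0):
    """Return the candidate states that are provably safe.

    known:      list of (state, h_value) pairs already observed (assumed input)
    candidates: list of states (tuples of floats) not yet visited
    lipschitz:  assumed Lipschitz constant L of the safety function h

    If h is L-Lipschitz, then h(c) >= h(s) - L * dist(s, c) for every
    observed state s, so the largest such lower bound certifies c.
    """
    safe = []
    for c in candidates:
        # Tightest lower bound on h(c) implied by the observed states.
        lower = max(h - lipschitz * math.dist(s, c) for s, h in known)
        if lower >= threshold:
            safe.append(c)
    return safe


known = [((0.0, 0.0), 1.0)]
candidates = [(0.5, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(certified_safe(known, candidates, lipschitz=1.0))
```

Under these assumptions the certificate holds for every state it admits, with no probability attached, which is the sense of "deterministic" used in the abstract. A candidate that fails the test is not necessarily unsafe; it simply cannot be certified from the observations so far.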

Publication:

arXiv e-prints

Pub Date:

April 2019

DOI:

10.48550/arXiv.1904.01068

arXiv:

arXiv:1904.01068

Bibcode:

2019arXiv190401068B

Keywords:

Computer Science - Robotics;
Computer Science - Artificial Intelligence;
Computer Science - Machine Learning;
Electrical Engineering and Systems Science - Systems and Control

E-Print:

Proceedings of the American Control Conference (ACC), July 2019. The first two authors contributed equally.
