Autonomous exploration for navigating in non-stationary CMPs

doi:10.48550/arXiv.1910.08446

We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) whose transition probabilities may change abruptly. For this setting, we propose a performance measure called exploration steps, which counts the time steps at which the learner lacks sufficient knowledge to navigate its environment efficiently. We devise a learning meta-algorithm, MNM, and prove an upper bound on the exploration steps in terms of the number of changes.

Publication:

arXiv e-prints

Pub Date:

October 2019

DOI:

10.48550/arXiv.1910.08446

arXiv:

arXiv:1910.08446

Bibcode:

2019arXiv191008446G

Keywords:

Computer Science - Machine Learning;
Statistics - Machine Learning
