Learning Expected Reward for Switched Linear Control Systems: A Non-Asymptotic View

doi:10.48550/arXiv.2006.08105

In this work, we show the existence of an invariant ergodic measure for switched linear dynamical systems (SLDSs) under a norm-stability assumption on the system dynamics in some unbounded subset of $\mathbb{R}^{n}$. Consequently, given a stationary Markov control policy, we use Birkhoff's Ergodic Theorem to derive non-asymptotic bounds for learning the expected reward from time averages, where the expectation is taken with respect to the invariant ergodic measure to which the closed-loop system mixes. These results provide a foundation for non-asymptotic analysis of average-reward optimal control of SLDSs. Finally, we illustrate the theoretical results in two case studies.
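The idea of learning an expected reward from time averages can be illustrated with a minimal sketch. It is not the paper's method or either of its case studies. It assumes, purely for illustration, closed-loop dynamics $x_{k+1} = A_{\sigma_k} x_k + w_k$ in which the matrices $A_i$ already absorb the stationary policy, the mode $\sigma_k$ follows a finite Markov chain with transition matrix `P`, and $w_k$ is additive Gaussian noise. The function name `time_average_reward`, the noise model, and all numerical values are hypothetical.

```python
import random


def time_average_reward(A_modes, P, reward, n_steps, noise_std=0.1, seed=0):
    """Time-average of reward(x_k) along one trajectory of
    x_{k+1} = A_{mode_k} x_k + w_k, with Markov mode switching (assumed model)."""
    rng = random.Random(seed)
    n = len(A_modes[0])              # state dimension
    mode_ids = list(range(len(A_modes)))
    x = [0.0] * n                    # start at the origin
    mode = 0                         # start in the first mode
    total = 0.0
    for _ in range(n_steps):
        total += reward(x)
        A = A_modes[mode]
        # Matrix-vector product plus independent Gaussian noise per coordinate.
        x = [sum(a * xi for a, xi in zip(row, x)) + rng.gauss(0.0, noise_std)
             for row in A]
        # Draw the next mode from row `mode` of the transition matrix.
        mode = rng.choices(mode_ids, weights=P[mode])[0]
    return total / n_steps


if __name__ == "__main__":
    # Two illustrative 2x2 modes, each with spectral norm below 1.
    A_modes = [
        [[0.5, 0.1], [0.0, 0.6]],
        [[0.7, 0.0], [-0.2, 0.4]],
    ]
    P = [[0.9, 0.1], [0.2, 0.8]]     # hypothetical mode transition matrix

    def quadratic_reward(x):
        return -sum(xi * xi for xi in x)

    for n_steps in (1_000, 10_000, 100_000):
        print(n_steps, time_average_reward(A_modes, P, quadratic_reward, n_steps))
```

Under these assumptions, the printed averages should settle as the trajectory length grows, which is the asymptotic statement that Birkhoff's theorem gives. The non-asymptotic bounds described in the abstract quantify how close a finite-length average is to its limit; this sketch does not compute them.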

Publication: arXiv e-prints

Pub Date: June 2020

DOI: 10.48550/arXiv.2006.08105

arXiv: arXiv:2006.08105

Bibcode: 2020arXiv200608105A

Keywords: Mathematics - Probability; Computer Science - Machine Learning; Electrical Engineering and Systems Science - Systems and Control
