Fundamental Performance Limits for Sensor-Based Robot Control and Policy Learning
Abstract
Our goal is to develop theory and algorithms for establishing fundamental limits on performance for a given task imposed by a robot's sensors. In order to achieve this, we define a quantity that captures the amount of task-relevant information provided by a sensor. Using a novel version of the generalized Fano inequality from information theory, we demonstrate that this quantity provides an upper bound on the highest achievable expected reward for one-step decision making tasks. We then extend this bound to multi-step problems via a dynamic programming approach. We present algorithms for numerically computing the resulting bounds, and demonstrate our approach on three examples: (i) the lava problem from the literature on partially observable Markov decision processes, (ii) an example with continuous state and observation spaces corresponding to a robot catching a freely-falling object, and (iii) obstacle avoidance using a depth sensor with non-Gaussian noise. We demonstrate the ability of our approach to establish strong limits on achievable performance for these problems by comparing our upper bounds with achievable lower bounds (computed by synthesizing or learning concrete control policies).
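As a rough illustration of how sensor information can cap one-step performance, the sketch below uses the classical (weak) Fano inequality, not the generalized version developed in the paper. It assumes a hypothetical discrete task in which the robot must guess the state from a single observation and receives reward 1 for a correct guess. The function names, the 8-state example, and both sensor models are assumptions made for illustration only.

```python
import math


def conditional_entropy_bits(prior, likelihood):
    """H(S | O) in bits, with prior[s] = p(s) and likelihood[s][o] = p(o | s)."""
    n_states, n_obs = len(prior), len(likelihood[0])
    h = 0.0
    for o in range(n_obs):
        p_o = sum(prior[s] * likelihood[s][o] for s in range(n_states))
        for s in range(n_states):
            joint = prior[s] * likelihood[s][o]
            if joint > 0.0:
                h -= joint * math.log2(joint / p_o)
    return h


def fano_success_upper_bound(prior, likelihood):
    """Upper bound on P(correct guess) from the weak classical Fano inequality.

    P(error) >= (H(S|O) - 1) / log2(|S|); requires at least two states.
    """
    h = conditional_entropy_bits(prior, likelihood)
    error_lower = max(0.0, (h - 1.0) / math.log2(len(prior)))
    return 1.0 - error_lower


def best_achievable_success(prior, likelihood):
    """Success probability of the MAP guess: sum over o of max over s of p(s, o)."""
    n_states, n_obs = len(prior), len(likelihood[0])
    return sum(
        max(prior[s] * likelihood[s][o] for s in range(n_states))
        for o in range(n_obs)
    )


# Hypothetical example: 8 equally likely states.
prior = [1.0 / 8] * 8
# Sensor A: two observations, each equally likely regardless of the state.
uninformative = [[0.5, 0.5] for _ in range(8)]
# Sensor B: the observation reveals the state exactly.
perfect = [[1.0 if o == s else 0.0 for o in range(8)] for s in range(8)]

for name, sensor in [("uninformative", uninformative), ("perfect", perfect)]:
    upper = fano_success_upper_bound(prior, sensor)
    lower = best_achievable_success(prior, sensor)
    print(f"{name}: achievable {lower:.3f} <= Fano upper bound {upper:.3f}")
```

Comparing the MAP success rate with the Fano bound mirrors, in a very simplified form, the comparison between achievable lower bounds and information-theoretic upper bounds described in the abstract.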
 Publication:

arXiv e-prints
 Pub Date:
 January 2022
 arXiv:
 arXiv:2202.00129
 Bibcode:
 2022arXiv220200129M
 Keywords:

 Computer Science - Robotics;
 Computer Science - Artificial Intelligence;
 Computer Science - Information Theory;
 Computer Science - Machine Learning;
 Mathematics - Optimization and Control
 E-Print:
 Proceedings of Robotics: Science and Systems (RSS), 2022