Quantum Markov Decision Processes: General Theory, Approximations, and Classes of Policies
Abstract
This paper develops a quantum counterpart to classical Markov decision processes (MDPs). First, we provide a very general formulation of quantum MDPs with state and action spaces in the quantum domain, quantum transitions, and cost functions. With the quantum MDP (q-MDP) formulated, we establish a verification theorem that proves the sufficiency of Markovian quantum control policies and provides a dynamic programming principle. We then compare our q-MDP model with previously established quantum MDP models (referred to as QOMDPs) in the literature. Furthermore, we obtain approximations of q-MDPs via finite-action models, which can be formulated as QOMDPs. Finally, we introduce classes of open-loop and classical-state-preserving closed-loop policies for q-MDPs, along with structural results for these policies. In summary, we present a novel quantum MDP model that aims to introduce a new framework, algorithms, and future research avenues. We hope that our approach will pave the way for a new research direction in discrete-time quantum control.
- Publication: arXiv e-prints
- Pub Date: February 2024
- DOI: 10.48550/arXiv.2402.14649
- arXiv: arXiv:2402.14649
- Bibcode: 2024arXiv240214649S
- Keywords: Quantum Physics; Electrical Engineering and Systems Science - Systems and Control; Mathematics - Optimization and Control
- E-Print: 30 pages