Deep Reinforcement Learning Based Optimal Infinite-Horizon Control of Probabilistic Boolean Control Networks
Abstract
In this paper, a deep reinforcement learning based method is proposed to obtain optimal policies for optimal infinite-horizon control of probabilistic Boolean control networks (PBCNs). Compared with the existing literature, the proposed method is model-free: neither the system model nor the initial states need to be known. It is also suitable for large-scale PBCNs. First, we establish the connection between deep reinforcement learning and optimal infinite-horizon control, and formulate the problem in the framework of a Markov decision process. Then, a PBCN is defined as large-scale or small-scale depending on whether the memory required to store its action-values exceeds the RAM of the computer. Based on this definition, Q-learning (QL) and the double deep Q-network (DDQN) are applied to the optimal infinite-horizon control of small-scale and large-scale PBCNs, respectively, and the corresponding optimal state feedback controllers are designed. Finally, two examples are presented: a small-scale PBCN with 3 nodes and a large-scale one with 28 nodes. To verify the convergence of QL and DDQN, the optimal control policies and optimal action-values obtained from both algorithms are compared with those obtained by policy iteration, a model-based method. The performance of QL is also compared with that of DDQN on the small-scale PBCN.
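The small-scale versus large-scale distinction and the tabular QL case can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 3-node update rules in `step`, the selection probability 0.7, the stage reward in `reward`, the single control input, and the hyperparameters. The helper `q_table_bytes` only estimates the size of a dense action-value table with one entry per state-action pair, which is the quantity the abstract compares against available RAM.

```python
import random


def q_table_bytes(n_nodes, n_inputs, bytes_per_value=8):
    """Bytes needed for a dense table with 2^n states and 2^m actions."""
    return (2 ** n_nodes) * (2 ** n_inputs) * bytes_per_value


def step(state, u, rng):
    """One transition of a hypothetical 3-node PBCN with one Boolean input u."""
    x1, x2, x3 = state
    # Node 1 switches randomly between two update rules (the probabilistic part).
    if rng.random() < 0.7:
        n1 = x2 and x3
    else:
        n1 = x2 or u
    n2 = x1 != u          # XOR of x1 and u
    n3 = (not x1) or u
    return (int(n1), int(n2), int(n3))


def reward(next_state, u):
    """Assumed stage reward: reach (1, 1, 1), with a small cost for using control."""
    return (1.0 if next_state == (1, 1, 1) else 0.0) - 0.1 * u


def q_learning(steps=50_000, gamma=0.9, alpha=0.1, eps=0.1, seed=0):
    """Model-free tabular Q-learning: only sampled transitions are used."""
    rng = random.Random(seed)
    states = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    Q = {(s, a): 0.0 for s in states for a in (0, 1)}
    s = rng.choice(states)
    for t in range(steps):
        if t % 50 == 0:
            s = rng.choice(states)  # random restart so every state is visited
        # Epsilon-greedy action selection.
        if rng.random() < eps:
            u = rng.randrange(2)
        else:
            u = max((0, 1), key=lambda a: Q[(s, a)])
        s_next = step(s, u, rng)
        target = reward(s_next, u) + gamma * max(Q[(s_next, 0)], Q[(s_next, 1)])
        Q[(s, u)] += alpha * (target - Q[(s, u)])
        s = s_next
    # Greedy state feedback law read off the learned action-values.
    policy = {s: max((0, 1), key=lambda a: Q[(s, a)]) for s in states}
    return Q, policy


if __name__ == "__main__":
    Q, policy = q_learning()
    print("Q-table bytes for 3 nodes, 1 input:", q_table_bytes(3, 1))
    for s in sorted(policy):
        print(s, "->", policy[s])
```

The table size grows as 2^(n+m), so a dense table that is trivial for 3 nodes quickly becomes impractical as nodes are added. That growth is the motivation for replacing the table with a function approximator such as DDQN in the large-scale case.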
- Publication: arXiv e-prints
- Pub Date: April 2023
- arXiv: arXiv:2304.03489
- Bibcode: 2023arXiv230403489N
- Keywords: Electrical Engineering and Systems Science - Systems and Control