Convergence of the standard RLS method and UDU^T factorisation of covariance matrix for solving the algebraic Riccati equation of the DLQR via heuristic approximate dynamic programming
This article proposes UDU^T factorisation of the covariance matrix to address the convergence and numerical stability problems caused by covariance matrix ill-conditioning in the recursive least squares (RLS) approach. RLS is used here for online approximation of the solution of the algebraic Riccati equation (ARE) associated with the discrete linear quadratic regulator (DLQR) problem, formulated in the context of actor-critic reinforcement learning and approximate dynamic programming. Parameterisations of the Bellman equation, the utility function and the dynamic system, together with the algebra of the Kronecker product, form a framework for solving the DLQR problem. The condition number and the positivity parameter of the covariance matrix are associated with statistical metrics for evaluating how well RLS-based estimators approximate the ARE solution. The performance of the RLS approximators is also evaluated in terms of consistency and bias when they are combined with reinforcement learning methods. The methodology comprises online realisations of DLQR controller designs, which are evaluated on a multivariable dynamic system model.
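To make the contrast between the two estimators concrete, the following is a minimal sketch, not the implementation used in the article. It compares a standard RLS covariance update with a Bierman-type update that propagates the factors of P = U D U^T (U unit upper triangular, D diagonal) instead of P itself. The function names, the forgetting factor `lam`, the initial covariance scale `delta` and the random regressors are all assumptions made for illustration; in the DLQR setting the regressor would instead be built from Kronecker products of the state and control signals.

```python
import numpy as np

def rls_standard(theta, P, phi, y, lam=1.0):
    """One standard RLS step; lam is an assumed forgetting factor."""
    Pphi = P @ phi
    gain = Pphi / (lam + phi @ Pphi)
    theta = theta + gain * (y - phi @ theta)
    P = (P - np.outer(gain, Pphi)) / lam
    return theta, P

def rls_udut(theta, U, d, phi, y, lam=1.0):
    """One RLS step on the factors of P = U diag(d) U^T (Bierman-type update)."""
    U, d = U.copy(), d.copy()
    f = U.T @ phi          # regressor in the coordinates of the factorisation
    v = d * f              # becomes the unnormalised gain vector
    alpha = lam
    for j in range(phi.size):
        beta = alpha
        alpha = alpha + f[j] * v[j]
        mu = -f[j] / beta
        d[j] = d[j] * beta / (alpha * lam)   # stays positive if d[j] > 0
        for i in range(j):
            u_old = U[i, j]
            U[i, j] = u_old + v[i] * mu
            v[i] = v[i] + v[j] * u_old
    gain = v / alpha
    theta = theta + gain * (y - phi @ theta)
    return theta, U, d

def demo(n=4, steps=50, lam=0.98, delta=1e3, seed=0):
    """Run both estimators on the same noise-free synthetic data."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=n)          # hypothetical parameter vector
    theta_s, P = np.zeros(n), delta * np.eye(n)
    theta_u, U, d = np.zeros(n), np.eye(n), delta * np.ones(n)
    for _ in range(steps):
        phi = rng.normal(size=n)
        y = phi @ theta_true
        theta_s, P = rls_standard(theta_s, P, phi, y, lam)
        theta_u, U, d = rls_udut(theta_u, U, d, phi, y, lam)
    return theta_s, P, theta_u, U, d

if __name__ == "__main__":
    theta_s, P, theta_u, U, d = demo()
    print("cond(P), standard RLS:", np.linalg.cond(P))
    print("min diagonal of D    :", d.min())
    print("max |P - U D U^T|    :", np.abs(P - U @ np.diag(d) @ U.T).max())
```

In exact arithmetic both routines produce the same covariance. The factored form is of interest because each diagonal entry of D is only ever multiplied by a positive ratio, so positivity of the reconstructed covariance is preserved by construction, whereas the subtraction in the standard update can lose it under ill-conditioning.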