Meta-Learning Linear Quadratic Regulators: A Policy Gradient MAML Approach for Model-free LQR

doi:10.48550/arXiv.2401.14534

We investigate the problem of learning linear quadratic regulators (LQR) in a multi-task, heterogeneous, and model-free setting. We characterize the stability and personalization guarantees of a policy gradient-based (PG) model-agnostic meta-learning (MAML) (Finn et al., 2017) approach for the LQR problem under different task-heterogeneity settings. We show that our MAML-LQR algorithm produces a stabilizing controller close to each task-specific optimal controller up to a task-heterogeneity bias in both model-based and model-free learning scenarios. Moreover, in the model-based setting, we show that such a controller is achieved with a linear convergence rate, which improves upon sub-linear rates from existing work. Our theoretical guarantees demonstrate that the learned controller can efficiently adapt to unseen LQR tasks.
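As a rough illustration of the model-based setting described above, the sketch below runs a meta-learning loop over a few scalar LQR tasks. It is not the paper's algorithm: it assumes scalar dynamics x_{t+1} = a x_t + b u_t with u_t = -k x_t, unit cost weights, E[x_0^2] = 1, and a first-order approximation of the MAML gradient (the Hessian term is dropped). The names `lqr_cost_and_grad` and `first_order_maml_lqr`, the task list, and the step sizes are all hypothetical choices made for this example.

```python
def lqr_cost_and_grad(k, a, b, q=1.0, r=1.0):
    """Cost C(k) and gradient dC/dk for the scalar system x+ = a*x + b*u, u = -k*x.

    Assumes E[x0^2] = 1, so C(k) = (q + r*k^2) / (1 - (a - b*k)^2).
    """
    acl = a - b * k                      # closed-loop coefficient
    if abs(acl) >= 1.0:
        raise ValueError("gain k does not stabilize the system")
    sigma = 1.0 / (1.0 - acl ** 2)       # accumulated state variance, sum of acl^(2t)
    p = (q + r * k ** 2) * sigma         # cost-to-go coefficient, equal to C(k) here
    grad = 2.0 * ((r + b * b * p) * k - b * p * a) * sigma
    return p, grad


def first_order_maml_lqr(tasks, k0, inner_lr=0.01, outer_lr=0.05, iters=200):
    """First-order MAML sketch (an assumption of this example, not the paper's update).

    Each task takes one inner gradient step from the shared gain k, and the
    gradients evaluated at the adapted gains are averaged for the outer step.
    """
    k = k0
    for _ in range(iters):
        meta_grad = 0.0
        for a, b in tasks:
            _, g = lqr_cost_and_grad(k, a, b)
            k_adapted = k - inner_lr * g                 # task-specific adaptation
            _, g_adapted = lqr_cost_and_grad(k_adapted, a, b)
            meta_grad += g_adapted / len(tasks)          # Hessian term omitted
        k -= outer_lr * meta_grad
    return k


# Hypothetical heterogeneous tasks: (a, b) pairs that differ slightly in a.
tasks = [(0.9, 1.0), (1.0, 1.0), (1.1, 1.0)]
k_meta = first_order_maml_lqr(tasks, k0=1.0)
for a, b in tasks:
    cost, _ = lqr_cost_and_grad(k_meta, a, b)
    print(f"a={a}: closed loop {a - b * k_meta:.3f}, cost {cost:.3f}")
```

The initial gain `k0 = 1.0` is chosen so that it stabilizes every task in the list, since policy gradient methods for LQR need a stabilizing starting point for the cost to be finite. The spread of the `a` values plays the role of task heterogeneity: the single meta-learned gain cannot match every task-specific optimum exactly.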

Publication: arXiv e-prints

Pub Date: January 2024

DOI: 10.48550/arXiv.2401.14534

arXiv: arXiv:2401.14534

Bibcode: 2024arXiv240114534T

Keywords: Mathematics - Optimization and Control; Computer Science - Machine Learning
