MPC-based Reinforcement Learning for a Simplified Freight Mission of Autonomous Surface Vehicles

In this work, we propose a Model Predictive Control (MPC)-based Reinforcement Learning (RL) method for Autonomous Surface Vehicles (ASVs). The objective is to find a policy that optimizes the closed-loop performance of a simplified freight mission, which comprises collision-free path following, autonomous docking, and a skillful transition between the two. We use a parametrized MPC scheme to approximate the optimal policy; the scheme accounts for path-following and docking costs as well as state (position, velocity) and input (thruster force, angle) constraints. The Least Squares Temporal Difference (LSTD)-based Deterministic Policy Gradient (DPG) method is then applied to update the policy parameters. Our simulation results demonstrate that the proposed MPC-LSTD-based DPG method can improve the closed-loop performance during learning for the ASV freight mission problem.

Publication:

arXiv e-prints

Pub Date:

June 2021

DOI:

10.48550/arXiv.2106.08634

arXiv:

arXiv:2106.08634

Bibcode:

2021arXiv210608634C

Keywords:

Electrical Engineering and Systems Science - Systems and Control

E-Print:

6 pages, 7 figures; accepted for presentation at the 2021 60th IEEE Conference on Decision and Control (CDC)
