Proposing Hierarchical Goal-Conditioned Policy Planning in Multi-Goal Reinforcement Learning

doi:10.48550/arXiv.2501.01727

Rens, Gavin B.

Humanoid robots must master numerous tasks with sparse rewards, posing a challenge for reinforcement learning (RL). We propose a method combining RL and automated planning to address this. Our approach uses short goal-conditioned policies (GCPs) organized hierarchically, with Monte Carlo Tree Search (MCTS) planning using high-level actions (HLAs). Instead of primitive actions, the planning process generates HLAs. A single plan-tree, maintained during the agent's lifetime, holds knowledge about goal achievement. This hierarchy enhances sample efficiency and speeds up reasoning by reusing HLAs and anticipating future actions. Our Hierarchical Goal-Conditioned Policy Planning (HGCPP) framework uniquely integrates GCPs, MCTS, and hierarchical RL, potentially improving exploration and planning in complex tasks.

Publication:

arXiv e-prints

Pub Date:

January 2025

DOI:

10.48550/arXiv.2501.01727

arXiv:

arXiv:2501.01727

Bibcode:

2025arXiv250101727R

Keywords:

Computer Science - Artificial Intelligence;
Computer Science - Machine Learning

E-Print:

10 pages, 4 figures. This is a preprint of the peer-reviewed version published by SCITEPRESS for ICAART-2025.
