Hypothesis Network Planned Exploration for Rapid Meta-Reinforcement Learning Adaptation

doi:10.48550/arXiv.2311.03701

Meta Reinforcement Learning (Meta RL) trains agents that adapt to fast-changing environments and tasks. Current strategies often lose adaptation efficiency because model exploration is passive, which delays the agent's understanding of new transition dynamics. As a result, particularly fast-evolving tasks become impossible to solve. We propose Hypothesis Network Planned Exploration (HyPE), an approach that integrates an active, planned exploration process via a hypothesis network to optimize adaptation speed. HyPE uses a generative hypothesis network to form candidate models of the state transition dynamics, then eliminates incorrect models through strategically devised experiments. Evaluated on a symbolic version of the Alchemy game, HyPE outpaces baseline methods in adaptation speed and model accuracy, supporting its potential to improve reinforcement learning adaptation in rapidly evolving settings.
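The propose-then-eliminate loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the generative hypothesis network is replaced here by a fixed, hand-written list of deterministic candidate models, and the "strategically devised experiment" is assumed to be the state-action pair whose predicted outcomes split the remaining candidates most evenly. All names (`make_model`, `most_informative_experiment`, `eliminate`, `identify_dynamics`) and the toy dynamics are hypothetical.

```python
from collections import Counter

def make_model(shift):
    """Hypothetical candidate dynamics: action 1 adds `shift` (mod 4), action 0 does nothing."""
    return lambda state, action: (state + shift) % 4 if action == 1 else state

def most_informative_experiment(hypotheses, candidates):
    """Pick the (state, action) pair that leaves the fewest hypotheses in the worst case."""
    def worst_case_remaining(pair):
        state, action = pair
        counts = Counter(h(state, action) for h in hypotheses)
        return max(counts.values())
    return min(candidates, key=worst_case_remaining)

def eliminate(hypotheses, state, action, observed):
    """Keep only the hypotheses whose prediction matches the observed next state."""
    return [h for h in hypotheses if h(state, action) == observed]

def identify_dynamics(hypotheses, candidates, environment, max_steps=10):
    """Alternate between planning an experiment and pruning inconsistent hypotheses."""
    remaining = list(hypotheses)
    for _ in range(max_steps):
        if len(remaining) <= 1:
            break
        state, action = most_informative_experiment(remaining, candidates)
        observed = environment(state, action)
        pruned = eliminate(remaining, state, action, observed)
        if len(pruned) == len(remaining):
            break  # no available experiment separates the remaining hypotheses
        remaining = pruned
    return remaining

# Toy usage: four candidate models, the true environment uses shift = 2.
hypotheses = [make_model(k) for k in range(4)]
candidates = [(0, 0), (0, 1)]  # (state, action) pairs the agent may try
environment = make_model(2)
remaining = identify_dynamics(hypotheses, candidates, environment)
```

In this toy setting the no-op action (0) predicts the same outcome under every hypothesis, so the planner chooses action 1, which is the only experiment that tells the candidates apart. A passive explorer that sampled actions at random could spend steps on the uninformative one, which is the inefficiency the abstract attributes to passive exploration.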

Publication: arXiv e-prints

Pub Date: November 2023

DOI: 10.48550/arXiv.2311.03701

arXiv: arXiv:2311.03701

Bibcode: 2023arXiv231103701J

Keywords: Computer Science - Artificial Intelligence; Computer Science - Machine Learning
