Efficient Replay Memory Architectures in Multi-Agent Reinforcement Learning for Traffic Congestion Control
Abstract
Episodic control, inspired by the role of episodic memory in the human brain, has been shown to mitigate the sample inefficiency of model-free reinforcement learning by reusing high-return past experiences. However, the memory growth of episodic control is undesirable in large-scale multi-agent problems such as vehicle traffic management. This paper proposes a novel replay memory architecture, called Dual-Memory Integrated Learning, that augments multi-agent reinforcement learning methods for congestion control via adaptive light signal scheduling. Our dual-memory architecture mimics two core capabilities of human decision-making. First, it relies on diverse types of memory--semantic and episodic, short-term and long-term--in order to remember high-return states that occur often in the network and filter out states that do not. Second, it employs equivalence classes to group together similar state-action pairs that can be controlled using the same action (i.e., light signal sequence). Theoretical analyses establish memory growth bounds, and simulation experiments on several intersection networks showcase improved congestion performance (e.g., vehicle throughput) from our method.
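The two mechanisms described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class names, the promotion rule (keep a state class in long-term memory once it has been visited often enough with sufficiently high return), and the coarse-rounding equivalence classes are all illustrative assumptions.

```python
from collections import defaultdict

class DualMemory:
    """Hypothetical sketch of a dual-memory replay store: a short-term
    episodic buffer plus a long-term table that retains only state
    classes seen frequently with high return."""

    def __init__(self, promote_after=3, return_threshold=0.0):
        self.episodic = {}              # short-term: class -> (best return, action)
        self.counts = defaultdict(int)  # visit counts per state class
        self.long_term = {}             # long-term: class -> best action
        self.promote_after = promote_after
        self.return_threshold = return_threshold

    def state_class(self, state):
        # Equivalence class: group states assumed controllable by the same
        # light-signal sequence; here approximated by coarse rounding.
        return tuple(round(x) for x in state)

    def add(self, state, action, ret):
        key = self.state_class(state)
        self.counts[key] += 1
        best_ret, _ = self.episodic.get(key, (float("-inf"), None))
        if ret > best_ret:
            self.episodic[key] = (ret, action)
        # Promote frequent, high-return classes to long-term memory;
        # rare or low-return classes are filtered out, bounding growth.
        if (self.counts[key] >= self.promote_after
                and self.episodic[key][0] > self.return_threshold):
            self.long_term[key] = self.episodic[key][1]

    def recall(self, state):
        # Returns a stored action for this state's class, if any.
        return self.long_term.get(self.state_class(state))
```

Because only promoted equivalence classes persist, long-term memory size is bounded by the number of classes rather than the number of raw experiences, which is the intuition behind the memory growth bounds mentioned above.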
- Publication: arXiv e-prints
- Pub Date: July 2024
- DOI: 10.48550/arXiv.2407.16034
- arXiv: arXiv:2407.16034
- Bibcode: 2024arXiv240716034C
- Keywords: Electrical Engineering and Systems Science - Systems and Control
- E-Print: Full version of accepted paper to IEEE Intelligent Transportation Systems Conference (ITSC) 2024