Solving Two-Player General-Sum Games Between Swarms
Abstract
Hamilton-Jacobi-Isaacs (HJI) PDEs are the governing equations for two-player general-sum games. Unlike Reinforcement Learning (RL) methods, which are data-intensive approaches to learning value functions, learning the solutions of HJI PDEs provides guaranteed convergence to the Nash equilibrium value of the game when it exists. A caveat, however, is that solving HJI PDEs becomes intractable as the state dimension increases. To circumvent this curse of dimensionality (CoD), physics-informed machine learning methods with supervision can be used, and they have been shown to be effective in generating equilibrium policies in two-player general-sum games. In this work, we extend existing work on agent-level two-player games to a swarm-level two-player game, in which two sub-swarms play a general-sum game. We use the Kolmogorov forward equation as the dynamic model for the evolution of the swarm densities. Results show that policies generated by the physics-informed neural network (PINN) achieve a higher payoff than a Nash Double Deep Q-Network (Nash DDQN) agent and perform comparably to numerical solvers.
- Publication:
- arXiv e-prints
- Pub Date:
- October 2023
- DOI:
- 10.48550/arXiv.2310.01682
- arXiv:
- arXiv:2310.01682
- Bibcode:
- 2023arXiv231001682G
- Keywords:
- Computer Science - Multiagent Systems;
- Computer Science - Computer Science and Game Theory;
- Computer Science - Robotics
- E-Print:
- Submitted to ACC 2024. Revised Version, fixed typo in algorithm (DQN instead of DDQN)