ReLExS: Reinforcement Learning Explanations for Stackelberg No-Regret Learners
Abstract
With the constraint of a no-regret follower, will the players in a two-player Stackelberg game still reach the Stackelberg equilibrium? We first show that when the follower's strategy is either reward-average or transform-reward-average, the two players can always reach the Stackelberg equilibrium. We then extend this result, showing that the players can achieve the Stackelberg equilibrium in the two-player game under the no-regret constraint. We also establish a strict upper bound on the difference in the follower's utility with and without the no-regret constraint. Moreover, in constant-sum two-player Stackelberg games with no-regret action sequences, we show that the total optimal utility of the game also remains bounded.
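To make the "no-regret follower" setting concrete, the sketch below simulates a hypothetical two-player Stackelberg interaction in which the leader commits to a mixed strategy and the follower runs Hedge (multiplicative weights), a standard no-regret algorithm. The payoff matrix, learning rate, and horizon are illustrative assumptions, not values from the paper; the point is only that the follower's average regret against its best fixed response shrinks over time.

```python
import math
import random

# Hypothetical 2x2 follower payoffs (illustrative, not from the paper).
# follower_payoff[a][b]: follower's reward when the leader plays a, follower plays b.
follower_payoff = [[1.0, 0.0],
                   [0.2, 0.8]]
leader_strategy = [0.7, 0.3]  # leader's committed mixed strategy (assumed)

def expected_follower_payoff(b):
    """Follower's expected payoff for action b against the committed leader strategy."""
    return sum(p * follower_payoff[a][b] for a, p in enumerate(leader_strategy))

def hedge_follower(T, eta=0.1, seed=0):
    """Run Hedge (multiplicative weights) for T rounds; return the average regret."""
    rng = random.Random(seed)
    weights = [1.0, 1.0]
    total_reward = 0.0
    for _ in range(T):
        z = sum(weights)
        probs = [w / z for w in weights]
        b = 0 if rng.random() < probs[0] else 1          # follower samples from Hedge
        a = 0 if rng.random() < leader_strategy[0] else 1  # leader samples commitment
        total_reward += follower_payoff[a][b]
        # Full-information update: each action's weight grows with its
        # counterfactual reward this round.
        for i in range(2):
            weights[i] *= math.exp(eta * follower_payoff[a][i])
    # Regret is measured against the best fixed action in hindsight (in expectation).
    best_fixed = max(expected_follower_payoff(0), expected_follower_payoff(1)) * T
    return (best_fixed - total_reward) / T

avg_regret = hedge_follower(T=5000)
```

Under these assumed payoffs, the follower's play concentrates on its best response to the leader's commitment, which is the dynamic the paper's no-regret constraint formalizes.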
- Publication:
- arXiv e-prints
- Pub Date:
- August 2024
- DOI:
- 10.48550/arXiv.2408.14086
- arXiv:
- arXiv:2408.14086
- Bibcode:
- 2024arXiv240814086H
- Keywords:
- Computer Science - Computer Science and Game Theory;
- Computer Science - Machine Learning
- E-Print:
- 10 pages, 3 figures. Technical Report