Author
Guo, X
Hu, A
Xu, R
Zhang, J
Last updated
2020-05-22T14:45:32.81+01:00
Abstract
This paper presents a general mean-field game (GMFG) framework for simultaneous learning and decision-making in stochastic games with a large population. It first establishes the existence of a unique Nash equilibrium for this GMFG and demonstrates that naively combining Q-learning with the fixed-point approach used in classical MFGs yields unstable algorithms. It then
proposes value-based and policy-based reinforcement learning algorithms (GMF-V and GMF-P, respectively) with smoothed policies, together with an analysis of their convergence properties and computational complexity. Experiments on repeated ad auction problems demonstrate that GMF-V-Q, a specific GMF-V algorithm based on Q-learning, is efficient and robust in terms of convergence and learning accuracy. Moreover, it outperforms existing multi-agent reinforcement learning algorithms in convergence, stability, and learning ability.
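The abstract describes alternating an inner reinforcement-learning step (against a frozen mean field) with a smoothed-policy update of the population distribution. The toy sketch below illustrates that loop shape only; the state space, dynamics, reward, softmax temperature, and damping factor are all hypothetical choices for illustration, not the paper's model or the GMF-V-Q algorithm itself.

```python
import numpy as np

# Hedged toy sketch of a smoothed fixed-point iteration:
# (1) run Q-learning against a frozen population distribution mu,
# (2) use a softmax ("smoothed") policy to induce a new mu, with damping.
# Everything below (dynamics, reward, constants) is a made-up illustration.

rng = np.random.default_rng(0)
n_states, n_actions = 3, 2

def reward(s, a, mu):
    # hypothetical reward: avoid crowded states, slight action bonus
    return -mu[s] + 0.1 * a

def step(s, a):
    # hypothetical deterministic dynamics
    return (s + a) % n_states

def softmax(q, temp=0.1):
    z = np.exp((q - q.max()) / temp)
    return z / z.sum()

mu = np.ones(n_states) / n_states          # initial mean field
for outer in range(20):                    # outer fixed-point iterations
    Q = np.zeros((n_states, n_actions))
    s = 0
    for t in range(2000):                  # inner Q-learning vs frozen mu
        a = int(rng.integers(n_actions)) if rng.random() < 0.1 else int(Q[s].argmax())
        s2 = step(s, a)
        Q[s, a] += 0.1 * (reward(s, a, mu) + 0.9 * Q[s2].max() - Q[s, a])
        s = s2
    # smoothed policy pushes the population distribution forward one step
    new_mu = np.zeros(n_states)
    for st in range(n_states):
        pi = softmax(Q[st])
        for a in range(n_actions):
            new_mu[step(st, a)] += mu[st] * pi[a]
    mu = 0.5 * mu + 0.5 * new_mu           # damped update for stability

print(np.round(mu, 3))
```

The softmax policy is the "smoothing" the abstract refers to in spirit: it keeps the population update a continuous function of Q, which is what a naive argmax-based fixed-point iteration lacks.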
Symplectic ID
1106098
Download URL
https://renyuanxu.github.io/
Favourite
Off
Publication type
59
Created on 22 May 2020 - 17:30.