Convergence Problems of General-Sum Multiagent Reinforcement Learning
- Michael Bowling, University of Alberta
Stochastic games generalize MDPs to multiple agents and can serve as a framework for investigating multiagent learning. Hu and Wellman (1998) recently proposed a multiagent Q-learning method for general-sum stochastic games. In addition to describing the algorithm, they prove that, under specified conditions, the method converges to a Nash equilibrium of the game. The convergence proof depends on a lemma stating that the iteration used by the method is a contraction mapping. Unfortunately, the proof of this lemma is incomplete. In this paper we present a counterexample to the lemma and identify the flaw in its proof. We also introduce strengthened assumptions under which the lemma holds, and examine how this affects the classes of games to which the theoretical result applies.
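The iteration at issue updates each agent's Q-values by bootstrapping from a Nash equilibrium value of the stage game induced at the next state. The following is a minimal illustrative sketch of one such update for a two-player, two-action stage game; all names, payoff matrices, and parameter values are hypothetical, and for simplicity it enumerates only pure-strategy equilibria (Hu and Wellman's method uses a general, possibly mixed, equilibrium solver).

```python
import numpy as np

def pure_nash_values(A, B):
    """Return the payoffs (to agents 1 and 2) at a pure-strategy Nash
    equilibrium of the bimatrix stage game (A, B), or None if no pure
    equilibrium exists. Illustrative only."""
    n, m = A.shape
    for i in range(n):
        for j in range(m):
            # (i, j) is a pure Nash equilibrium if neither agent can
            # gain by unilaterally deviating.
            if A[i, j] >= A[:, j].max() and B[i, j] >= B[i, :].max():
                return A[i, j], B[i, j]
    return None

# One Nash-Q style update for agent 1 (hypothetical values):
# Q1_next and Q2_next are the agents' stage-game payoff matrices at the
# next state; the Nash value there serves as the bootstrap target.
alpha, gamma = 0.1, 0.9
Q1_next = np.array([[3.0, 0.0], [5.0, 1.0]])  # agent 1's Q at next state
Q2_next = np.array([[3.0, 5.0], [0.0, 1.0]])  # agent 2's Q at next state
v1, _ = pure_nash_values(Q1_next, Q2_next)

r1 = 1.0      # reward observed by agent 1
q_old = 2.0   # Q1[s][a1, a2] before the update
q_new = (1 - alpha) * q_old + alpha * (r1 + gamma * v1)
print(q_new)  # -> 1.99
```

The lemma in question asserts that the operator performing such updates is a contraction mapping in the sup-norm; the paper's counterexample shows this fails without additional assumptions on the stage games' equilibria.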
Citation
M. Bowling. "Convergence Problems of General-Sum Multiagent Reinforcement Learning". International Conference on Machine Learning (ICML), Stanford University, pp. 89-94, June 2000.
Category: In Conference
BibTeX
@inproceedings{Bowling:ICML00,
  author    = {Michael Bowling},
  title     = {Convergence Problems of General-Sum Multiagent Reinforcement Learning},
  booktitle = {International Conference on Machine Learning (ICML)},
  pages     = {89--94},
  year      = 2000,
}

Last Updated: April 24, 2007