Scalable learning in stochastic games

Full Text: 02aaai_ws_gtdt.pdf

Stochastic games are a general model of interaction between multiple agents. They have recently been the focus of a great deal of research in reinforcement learning, as they are both descriptive and have a well-defined Nash equilibrium solution. Most of this recent work, although very general, has only been applied to small games with at most hundreds of states. On the other hand, there are landmark results of learning being successfully applied to specific large and complex games such as Checkers and Backgammon. In this paper we describe a scalable learning algorithm for stochastic games that combines three separate ideas from reinforcement learning into a single algorithm. These ideas are tile coding for generalization, policy gradient ascent as the basic learning method, and our previous work on the WoLF ("Win or Learn Fast") variable learning rate to encourage convergence. We apply this algorithm to the intractably sized game-theoretic card game Goofspiel, showing preliminary results of learning in self-play. We demonstrate that policy gradient ascent can learn even in this highly non-stationary problem with simultaneous learning. We also show that the WoLF principle continues to have a converging effect even in large problems with approximation and generalization.
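For intuition, the sketch below shows in Python how the three ingredients named in the abstract might fit together: hashed tile coding for generalization, a softmax policy updated by gradient ascent, and a WoLF-style variable learning rate that compares the current policy against an average policy. The class names, the toy feature hashing, the linear critic, and all step sizes are illustrative assumptions, not the authors' implementation.

import numpy as np

class TileCoder:
    """Hashed tile coding: maps a state to a sparse binary feature vector,
    one active feature per tiling (a common generalization scheme)."""
    def __init__(self, n_tilings=8, n_features=1024):
        self.n_tilings = n_tilings
        self.n_features = n_features

    def features(self, state):
        phi = np.zeros(self.n_features)
        s = np.asarray(state, dtype=float)
        for t in range(self.n_tilings):
            # Each tiling is offset so tile boundaries fall in different places.
            tile = tuple(np.floor(s + t / self.n_tilings))
            phi[hash((t, tile)) % self.n_features] = 1.0
        return phi

class WoLFPolicyGradient:
    """Softmax policy over tile-coded features, trained by policy gradient
    ascent with a WoLF learning rate: step cautiously while 'winning',
    aggressively while 'losing'."""
    def __init__(self, n_actions, coder, lr_win=0.01, lr_lose=0.04, lr_q=0.1):
        self.n_actions, self.coder = n_actions, coder
        self.theta = np.zeros((n_actions, coder.n_features))   # actor weights
        self.theta_avg = np.zeros_like(self.theta)             # average-policy stand-in
        self.q = np.zeros((n_actions, coder.n_features))       # linear critic
        self.lr_win, self.lr_lose, self.lr_q = lr_win, lr_lose, lr_q
        self.updates = 0

    def policy(self, phi, theta=None):
        prefs = (self.theta if theta is None else theta) @ phi
        e = np.exp(prefs - prefs.max())                        # stable softmax
        return e / e.sum()

    def act(self, state):
        phi = self.coder.features(state)
        return int(np.random.choice(self.n_actions, p=self.policy(phi)))

    def update(self, state, action, target):
        """'target' is an estimate of the return for (state, action),
        e.g. the payoff observed at the end of a hand."""
        phi = self.coder.features(state)
        # Critic: move the linear action-value estimate toward the target.
        self.q[action] += self.lr_q * (target - self.q[action] @ phi) * phi
        q_vals = self.q @ phi
        pi = self.policy(phi)
        # WoLF test: 'winning' if the current policy looks at least as good
        # as the (approximate) average policy under current value estimates.
        winning = pi @ q_vals >= self.policy(phi, self.theta_avg) @ q_vals
        lr = self.lr_win if winning else self.lr_lose
        # Actor: gradient of log pi(action|state) for a softmax policy,
        # scaled by the advantage of the chosen action.
        advantage = q_vals[action] - pi @ q_vals
        grad = -np.outer(pi, phi)
        grad[action] += phi
        self.theta += lr * advantage * grad
        # Track average parameters as a crude stand-in for the average policy.
        self.updates += 1
        self.theta_avg += (self.theta - self.theta_avg) / self.updates

# Example usage on a toy two-dimensional state with three actions:
#   agent = WoLFPolicyGradient(n_actions=3, coder=TileCoder())
#   a = agent.act([0.2, 0.7])
#   agent.update([0.2, 0.7], a, target=1.0)

In self-play, two such agents would each call act() to pick simultaneous moves and update() on the observed payoffs; the WoLF rate switch is what discourages the non-stationary oscillation the abstract alludes to.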

Citation

M. Bowling and M. Veloso. "Scalable learning in stochastic games". In Game Theoretic and Decision Theoretic Agents, July 2002.

Keywords: machine learning
Category: In Workshop

BibTeX

@inproceedings{Bowling+Veloso:GameTheoreticandDecisionTheoreticAgents02,
  author = {Michael Bowling and Manuela Veloso},
  title = {Scalable learning in stochastic games},
  booktitle = {Game Theoretic and Decision Theoretic Agents},
  month = {July},
  year = 2002,
}

