Improving Exploration in UCT Using Local Manifolds

Monte Carlo planning has proven successful in many sequential decision-making settings, but it suffers from poor exploration when rewards are sparse. In this paper, we improve exploration in UCT by generalizing across similar states using a given distance metric. When the state space does not have a natural distance metric, we show how to learn a local manifold from the transition graph of states in the near future to obtain a distance metric. On domains inspired by video games, empirical evidence shows that our algorithm is more sample efficient than UCT, particularly when rewards are sparse.

Citation

S. Srinivasan, E. Talvitie, M. Bowling. "Improving Exploration in UCT Using Local Manifolds". National Conference on Artificial Intelligence (AAAI), (ed: Blai Bonet, Sven Koenig), pp 3386-3392, January 2015.

Category: In Conference
Web Links: AAAI

BibTeX

@incollection{Srinivasan+al:AAAI15,
  author = {Sriram Srinivasan and Erik Talvitie and Michael Bowling},
  title = {Improving Exploration in UCT Using Local Manifolds},
  Editor = {Blai Bonet and Sven Koenig},
  Pages = {3386--3392},
  booktitle = {National Conference on Artificial Intelligence (AAAI)},
  year = 2015,
}

Last Updated: October 29, 2020
Submitted by Sabina P
