Learning predictive state representations using non-blind policies

Michael Bowling, University of Alberta
Peter McCracken, Department of Computing Science, University of Alberta
Michael James, Toyota Technical Center, Ann Arbor, Michigan, USA
James Neufeld, Department of Computing Science, University of Alberta
Dana Wilkinson, School of Computer Science, University of Waterloo

Predictive state representations (PSRs) are powerful models of non-Markovian decision processes that differ from traditional models (e.g., HMMs, POMDPs) by representing state using only observable quantities. Because of this, PSRs can be learned solely using data from interaction with the process. The majority of existing techniques, though, explicitly or implicitly require that this data be gathered using a blind policy, where actions are selected independently of preceding observations. This is a severe limitation for practical learning of PSRs. We present two methods for fixing this limitation in most of the existing PSR algorithms: one when the policy is known and one when it is not. We then present an efficient optimization for computing good exploration policies to be used when learning a PSR. The exploration policies, which are not blind, significantly lower the amount of data needed to build an accurate model, thus demonstrating the importance of non-blind policies.

Citation

M. Bowling, P. McCracken, M. James, J. Neufeld, D. Wilkinson. "Learning predictive state representations using non-blind policies". International Conference on Machine Learning (ICML), Pittsburgh, pp. 129-136, June 2006.

Keywords:	machine learning
Category:	In Conference

BibTeX

@inproceedings{Bowling+al:ICML06,
  author = {Michael Bowling and Peter McCracken and Michael James and James
    Neufeld and Dana Wilkinson},
  title = {Learning predictive state representations using non-blind policies},
  pages = {129--136},
  booktitle = {International Conference on Machine Learning (ICML)},
  year = 2006,
}

Last Updated: April 24, 2007
Submitted by AICML Admin Assistant
