Learning predictive state representations using non-blind policies
- Michael Bowling, University of Alberta
- Peter McCracken, Department of Computing Science, University of Alberta
- Michael James, Toyota Technical Center, Ann Arbor, Michigan, USA
- James Neufeld, Department of Computing Science, University of Alberta
- Dana Wilkinson, School of Computer Science, University of Waterloo
Predictive state representations (PSRs) are powerful models of non-Markovian decision processes that differ from traditional models (e.g., HMMs, POMDPs) by representing state using only observable quantities. Because of this, PSRs can be learned solely using data from interaction with the process. The majority of existing techniques, though, explicitly or implicitly require that this data be gathered using a blind policy, where actions are selected independently of preceding observations. This is a severe limitation for practical learning of PSRs. We present two methods for fixing this limitation in most of the existing PSR algorithms: one when the policy is known and one when it is not. We then present an efficient optimization for computing good exploration policies to be used when learning a PSR. The exploration policies, which are not blind, significantly lower the amount of data needed to build an accurate model, thus demonstrating the importance of non-blind policies.
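To make the blind-policy limitation concrete, the following minimal sketch computes the large-sample limit of two estimators on a toy two-step process. Everything in it (the process, the policy, the probabilities, and the function names) is hypothetical and chosen for illustration only. The correction shown is standard importance weighting by the known policy's action probabilities; it illustrates the idea behind the known-policy case and is not necessarily the exact estimator used in the paper.

```python
from itertools import product

# Hypothetical toy process (an assumption, not from the paper):
# two time steps, binary actions and binary observations.

def obs1_prob(o1, a1):
    """Pr(o1 | a1): the first observation is a fair coin for either action."""
    return 0.5

def obs2_prob(o2, a2):
    """Pr(o2 | a2): the second observation depends only on the second action."""
    p_one = 0.8 if a2 == 1 else 0.3
    return p_one if o2 == 1 else 1.0 - p_one

def policy1(a1):
    """Known policy at step 1: uniform over the two actions."""
    return 0.5

def policy2(a2, o1):
    """Known policy at step 2: NON-BLIND, because it conditions on o1."""
    p_one = 0.9 if o1 == 1 else 0.1
    return p_one if a2 == 1 else 1.0 - p_one

# The test t = a1 o1 a2 o2 whose prediction we want from the null history.
TEST = (0, 1, 1, 1)

def true_prediction(test):
    """PSR prediction: product of observation probabilities given the test's actions."""
    a1, o1, a2, o2 = test
    return obs1_prob(o1, a1) * obs2_prob(o2, a2)

def trajectory_prob(a1, o1, a2, o2):
    """Probability of a full trajectory when acting with the non-blind policy."""
    return policy1(a1) * obs1_prob(o1, a1) * policy2(a2, o1) * obs2_prob(o2, a2)

def naive_estimate_limit(test):
    """Limit of (#times test succeeded) / (#times the test's actions were taken)."""
    ta1, _, ta2, _ = test
    succeeded = trajectory_prob(*test)
    attempted = sum(trajectory_prob(ta1, o1, ta2, o2)
                    for o1, o2 in product((0, 1), repeat=2))
    return succeeded / attempted

def weighted_estimate_limit(test):
    """Limit of the per-episode average of 1[test succeeded] / pi(test's actions)."""
    a1, o1, a2, _ = test
    weight = 1.0 / (policy1(a1) * policy2(a2, o1))
    return trajectory_prob(*test) * weight

if __name__ == "__main__":
    print("true prediction:    ", true_prediction(TEST))
    print("naive counting:     ", naive_estimate_limit(TEST))
    print("importance weighted:", weighted_estimate_limit(TEST))
```

In this toy setting the naive counting ratio converges to roughly 0.72 while the true prediction is 0.4, because the policy takes the test's second action far more often after the observation the test requires. Dividing each success by the probability the policy assigned to the test's actions removes that dependence and recovers 0.4.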
Citation
M. Bowling, P. McCracken, M. James, J. Neufeld, D. Wilkinson. "Learning predictive state representations using non-blind policies". International Conference on Machine Learning (ICML), Pittsburgh, pp 129-136, January 2006.
Keywords: | machine learning |
Category: | In Conference |
BibTeX
@incollection{Bowling+al:ICML06,
  author = {Michael Bowling and Peter McCracken and Michael James and James Neufeld and Dana Wilkinson},
  title = {Learning predictive state representations using non-blind policies},
  pages = {129-136},
  booktitle = {International Conference on Machine Learning (ICML)},
  year = 2006,
}
Last Updated: April 24, 2007