Investigating the Maximum Likelihood Alternative to TD(λ)
- Fletcher Lu, School of Computer Science, University of Waterloo
- Relu Patrascu, Department of Computer Science, University of Toronto
- Dale Schuurmans, AICML

The study of value estimation in Markov reward processes has been dominated by research on temporal difference methods since the introduction of TD(0) in 1988. Temporal difference methods are often contrasted with a maximum likelihood approach, where the transition matrix and reward vector are estimated explicitly and converted into a value estimate by solving a matrix equation. It is often asserted that maximum likelihood estimation yields more accurate values, but that the temporal difference approach is far more efficient computationally. In this paper we show that the first assertion is true, but the second can be false in many circumstances. In particular, we show that a reasonable implementation of a sparse matrix solver can yield run times for maximum likelihood that are competitive with TD(λ). In our experiments the maximum likelihood estimator yields more accurate values. This higher accuracy, in conjunction with competitive execution time, suggests that a model-based approach might yet be worth pursuing in scaling up reinforcement learning.
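The maximum likelihood approach described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, data layout, and use of SciPy's sparse solver are assumptions. It estimates the transition matrix and reward vector from observed samples and then solves the linear system (I − γP)v = r for the value estimate.

```python
# A minimal sketch (not the authors' implementation) of maximum likelihood
# value estimation for a Markov reward process: count observed transitions
# to estimate P and r, then solve (I - gamma*P) v = r with a sparse solver.
import numpy as np
from scipy.sparse import csr_matrix, identity
from scipy.sparse.linalg import spsolve

def ml_value_estimate(transitions, n_states, gamma=0.9):
    """transitions: iterable of observed (state, reward, next_state) samples."""
    counts = np.zeros((n_states, n_states))
    reward_sum = np.zeros(n_states)
    visits = np.zeros(n_states)
    for s, r, s_next in transitions:
        counts[s, s_next] += 1
        reward_sum[s] += r
        visits[s] += 1
    visits = np.maximum(visits, 1)            # guard unvisited states
    P = csr_matrix(counts / visits[:, None])  # ML transition estimate
    r = reward_sum / visits                   # ML expected-reward estimate
    A = identity(n_states, format="csr") - gamma * P
    return spsolve(A, r)                      # v = (I - gamma*P)^{-1} r
```

For example, a two-state chain where state 0 transitions to an absorbing zero-reward state 1 with reward 1 yields v = [1, 0] for any discount, since the only reward is collected immediately.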
Citation
F. Lu, R. Patrascu, and D. Schuurmans. "Investigating the Maximum Likelihood Alternative to TD(λ)". Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, January 2002.
| Keywords: | likelihood, alternative, machine learning |
| Category: | In Conference | 
BibTeX
@incollection{Lu+al:NIPS02,
  author = {Fletcher Lu and Relu Patrascu and Dale Schuurmans},
  title = {Investigating the Maximum Likelihood Alternative to TD($\lambda$)},
  booktitle = {Neural Information Processing Systems (NIPS)},
  year = 2002,
}

Last Updated: July 01, 2007
Submitted by Stuart H. Johnson