Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
- Andras Antos, Computer and Automation Research Inst.
- Csaba Szepesvari, Department of Computing Science; PI of AICML
- Remi Munos
Citation
A. Antos, C. Szepesvari, R. Munos. "Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory". Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 330-337, April 2007.
Keywords: | machine learning |
Category: | In Conference |
BibTeX
@incollection{Antos+al:ADPRL07,
  author = {Andras Antos and Csaba Szepesvari and Remi Munos},
  title = {Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory},
  pages = {330-337},
  booktitle = {Symposium on Approximate Dynamic Programming and Reinforcement Learning},
  year = 2007,
}
Last Updated: August 20, 2007
Submitted by Valerie Dacyk