Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory
- Andras Antos, Computer and Automation Research Inst.
- Csaba Szepesvari, Department of Computing Science; PI of AICML
- Remi Munos
Citation
A. Antos, C. Szepesvari, R. Munos. "Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory". Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp. 330-337, April 2007.
| Keywords: | machine learning |
| Category: | In Conference |
BibTeX
@incollection{Antos+al:ADPRL07,
author = {Andras Antos and Csaba Szepesvari and Remi Munos},
title = {Value-Iteration Based Fitted Policy Iteration: Learning with a
Single Trajectory},
pages = {330--337},
booktitle = {Symposium on Approximate Dynamic Programming and Reinforcement
Learning},
year = 2007,
}
Last Updated: August 20, 2007
Submitted by Valerie Dacyk