Policy Gradient Methods for Reinforcement Learning With Function Approximation
- Richard S. Sutton, Department of Computing Science, University of Alberta
- David McAllester, AT&T Labs-Research, Florham Park, New Jersey
- Satinder Singh, University of Michigan, Ann Arbor, MI
- Yishay Mansour, AT&T Labs-Research, Florham Park, New Jersey
Citation
R. Sutton, D. McAllester, S. Singh, Y. Mansour. "Policy Gradient Methods for Reinforcement Learning With Function Approximation". Neural Information Processing Systems (NIPS), Denver, CO, USA, pp. 1057-1063, January 1999.
Keywords: advantage, alternative, reinforce, differentiable, machine learning
Category: In Conference
BibTeX
@incollection{Sutton+al:NIPS99,
  author = {Richard S. Sutton and David McAllester and Satinder Singh and Yishay Mansour},
  title = {Policy Gradient Methods for Reinforcement Learning With Function Approximation},
  pages = {1057-1063},
  booktitle = {Neural Information Processing Systems (NIPS)},
  year = 1999,
}
Last Updated: May 31, 2007
Submitted by Staurt H. Johnson