An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning
- Richard S. Sutton, Department of Computing Science, University of Alberta
- Ashique Rupam Mahmood
- Martha White, University of Alberta
Citation
R. Sutton, A. Mahmood, M. White. "An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning". Journal of Machine Learning Research (JMLR), (ed: Shie Mannor), 17(73), pp 1-29, January 2016.
| Keywords: | Temporal-difference learning, Off-policy learning, Function approximation, Stability, Convergence |
| Category: | In Journal |
| Web Links: | JMLR |
BibTeX
@article{Sutton+al:JMLR16,
  author = {Richard S. Sutton and Ashique Rupam Mahmood and Martha White},
  title = {An Emphatic Approach to the Problem of Off-policy
    Temporal-Difference Learning},
  editor = {Shie Mannor},
  volume = {17},
  number = {73},
  pages = {1--29},
  journal = {Journal of Machine Learning Research (JMLR)},
  year = 2016,
}
Last Updated: March 25, 2020
Submitted by Sabina P