Acquiring a broad range of empirical knowledge in real time by temporal-difference learning
- Joseph Modayil
- Adam White
- Patrick M. Pilarski
- Richard S. Sutton, Department of Computing Science, University of Alberta
Several robot capabilities rely on predictions about the temporally extended consequences of a robot's behaviour. We describe how a robot can both learn and make many such predictions in real time using a standard algorithm. Our experiments show that a mobile robot can learn and make thousands of accurate predictions at 10 Hz. The predictions are about the future of all of the robot's sensors and many internal state variables at multiple time-scales. All the predictions share a single set of features and learning parameters. We demonstrate the generality of this method with an application to a different platform, a robot arm operating at 50 Hz. Here, learned predictions can be used to measurably improve the user interface. The temporally extended predictions learned in real time by this method constitute a basic form of knowledge about the dynamics of the robot's interaction with the environment. We also show how this method can be extended to express more general forms of knowledge.
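As a rough illustration of the approach summarized above, the sketch below learns several predictions in parallel from one shared feature vector. It assumes the "standard algorithm" is TD(λ) with linear function approximation and binary features, and that each time-scale corresponds to its own discount rate; the abstract states neither detail. The class name, parameter values, and toy signal are hypothetical and are not taken from the paper.

```python
class MultiPredictionTD:
    """Sketch: many linear TD(lambda) predictors sharing one feature vector.

    Assumptions (not from the abstract): binary features given as a list of
    active indices, accumulating eligibility traces, and one discount rate
    (gamma) per prediction to represent its time-scale.
    """

    def __init__(self, n_features, gammas, alpha=0.1, lam=0.9):
        self.gammas = list(gammas)
        self.alpha = alpha  # step size shared by every prediction
        self.lam = lam      # trace-decay parameter shared by every prediction
        self.w = [[0.0] * n_features for _ in self.gammas]  # weights
        self.e = [[0.0] * n_features for _ in self.gammas]  # traces

    def predict(self, active):
        # Each prediction is the sum of its weights over the active features.
        return [sum(w[i] for i in active) for w in self.w]

    def update(self, active, signals, next_active):
        v = self.predict(active)
        v_next = self.predict(next_active)
        for k, gamma in enumerate(self.gammas):
            w, e = self.w[k], self.e[k]
            decay = gamma * self.lam
            for i in range(len(e)):
                e[i] *= decay          # decay every trace
            for i in active:
                e[i] += 1.0            # accumulate on active features
            # TD error for prediction k, then a step along the traces.
            delta = signals[k] + gamma * v_next[k] - v[k]
            step = self.alpha * delta
            for i in range(len(w)):
                w[i] += step * e[i]


# Toy usage: one always-active feature and a constant signal of 1.0,
# predicted at three hypothetical time-scales.
gammas = [0.0, 0.5, 0.9]
learner = MultiPredictionTD(n_features=1, gammas=gammas)
for _ in range(2000):
    learner.update([0], [1.0] * len(gammas), [0])
preds = learner.predict([0])
```

In this toy setting each prediction approaches the discounted sum of a constant signal, 1 / (1 - gamma). On a real robot, the active feature indices would come from the sensor readings at each time step, and the predicted signals would be the sensor values themselves.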
Citation
J. Modayil, A. White, P. Pilarski, R. Sutton. "Acquiring a broad range of empirical knowledge in real time by temporal-difference learning". International Conference on Systems, Man, and Cybernetics (SMC), Seoul, South Korea, pp 1903-1910, October 2012.
Category: In Conference
Web Links: DOI, IEEE
BibTeX
@incollection{Modayil+al:SMC12,
  author = {Joseph Modayil and Adam White and Patrick M. Pilarski and Richard S. Sutton},
  title = {Acquiring a broad range of empirical knowledge in real time by temporal-difference learning},
  pages = {1903-1910},
  booktitle = {International Conference on Systems, Man, and Cybernetics (SMC)},
  year = 2012,
}
Last Updated: November 10, 2020
Submitted by Sabina P