Acquiring a broad range of empirical knowledge in real time by temporal-difference learning

Full Text: 06378016.pdf (PDF)

Several robot capabilities rely on predictions about the temporally extended consequences of a robot's behaviour. We describe how a robot can both learn and make many such predictions in real time using a standard algorithm. Our experiments show that a mobile robot can learn and make thousands of accurate predictions at 10 Hz. The predictions are about the future of all of the robot's sensors and many internal state variables at multiple time-scales. All the predictions share a single set of features and learning parameters. We demonstrate the generality of this method with an application to a different platform, a robot arm operating at 50 Hz. Here, learned predictions can be used to measurably improve the user interface. The temporally extended predictions learned in real time by this method constitute a basic form of knowledge about the dynamics of the robot's interaction with the environment. We also show how this method can be extended to express more general forms of knowledge.
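The abstract's central idea, many predictions learned in parallel from one shared feature vector with shared learning parameters, can be illustrated with a small sketch. The abstract names only "a standard algorithm" and the title names temporal-difference learning, so the specific choices here are assumptions: linear TD(lambda) with accumulating eligibility traces, one discount factor per prediction to set its time-scale, and a sensor reading as the predicted signal. The class name `SharedFeatureTD`, its parameters, and the default values are hypothetical and are not taken from the paper.

```python
class SharedFeatureTD:
    """Illustrative sketch: many linear TD(lambda) predictions over one feature vector."""

    def __init__(self, n_features, targets, alpha=0.1, lam=0.5):
        # targets: one (sensor_index, gamma) pair per prediction (assumed layout).
        # gamma sets the time-scale; alpha and lam are shared by all predictions.
        self.targets = targets
        self.alpha = alpha
        self.lam = lam
        self.w = [[0.0] * n_features for _ in targets]  # weights, one row per prediction
        self.e = [[0.0] * n_features for _ in targets]  # eligibility traces

    def predict(self, x):
        # Each prediction is a dot product of its weights with the shared features.
        return [sum(wi * xi for wi, xi in zip(w, x)) for w in self.w]

    def update(self, x, sensors_next, x_next):
        # One TD(lambda) step for every prediction, given features x at time t,
        # the sensor readings at t+1, and features x_next at t+1.
        v = self.predict(x)
        v_next = self.predict(x_next)
        for k, (sensor, gamma) in enumerate(self.targets):
            e, w = self.e[k], self.w[k]
            for i, xi in enumerate(x):
                e[i] = gamma * self.lam * e[i] + xi  # accumulating trace
            delta = sensors_next[sensor] + gamma * v_next[k] - v[k]  # TD error
            for i in range(len(w)):
                w[i] += self.alpha * delta * e[i]
```

Each time step costs work proportional to the number of predictions times the number of features, which is what makes a fixed update rate plausible for a large set of predictions. With a constant sensor reading of 1 and a discount of 0.5, a prediction in this sketch converges to 1 / (1 - 0.5) = 2, the discounted sum of future readings.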

Citation

J. Modayil, A. White, P. Pilarski, R. Sutton. "Acquiring a broad range of empirical knowledge in real time by temporal-difference learning". International Conference on Systems, Man, and Cybernetics (SMC), Seoul, South Korea, pp 1903-1910, October 2012.

Category: In Conference
Web Links: DOI, IEEE

BibTeX

@inproceedings{Modayil+al:SMC12,
  author = {Joseph Modayil and Adam White and Patrick M. Pilarski and Richard
    S. Sutton},
  title = {Acquiring a broad range of empirical knowledge in real time by
    temporal-difference learning},
  pages = {1903--1910},
  booktitle = {International Conference on Systems, Man, and Cybernetics (SMC)},
  year = 2012,
}

Last Updated: November 10, 2020
Submitted by Sabina P
