Prediction in Intelligence: An Empirical Comparison of Off-policy Algorithms on Robots

Banafsheh Rafiee
Sina Ghiassian, University of Alberta
Adam White
Richard S. Sutton, Department of Computing Science, University of Alberta

The ability to continually make predictions about the world may be central to intelligence. Off-policy learning and general value functions (GVFs) are well-established algorithmic techniques for learning about many signals while interacting with the world. In the past couple of years, many ambitious works have used off-policy GVF learning to improve control performance in both simulation and robotic control tasks. Many of these works use semi-gradient temporal-difference (TD) learning algorithms, like Q-learning, which are potentially divergent. In the last decade, several TD learning algorithms have been proposed that are convergent and computationally efficient, but not much is known about how they perform in practice, especially on robots. In this work, we perform an empirical comparison of modern off-policy GVF learning algorithms on three different robot platforms, providing insights into their strengths and weaknesses. We also discuss the challenges of conducting fair comparative studies of off-policy learning on robots and develop a new evaluation methodology that is successful and applicable to a relatively complicated robot domain.

Citation

B. Rafiee, S. Ghiassian, A. White, R. Sutton. "Prediction in Intelligence: An Empirical Comparison of Off-policy Algorithms on Robots". Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), (eds: Edith Elkind, Manuela Veloso, Noa Agmon, Matthew E. Taylor), pp. 332-340, May 2019.

Keywords:	artificial intelligence, robotics, reinforcement learning, off-policy learning, temporal-difference learning, general value functions
Category:	In Conference
Web Links:	ACM Digital Library

BibTeX

@incollection{Rafiee+al:AAMAS19,
  author = {Banafsheh Rafiee and Sina Ghiassian and Adam White and Richard S.
    Sutton},
  title = {Prediction in Intelligence: An Empirical Comparison of Off-policy
    Algorithms on Robots},
  Editor = {Edith Elkind and Manuela Veloso and Noa Agmon and Matthew E.
    Taylor},
  Pages = {332--340},
  booktitle = {Joint Conference on Autonomous Agents and Multi-Agent Systems
    (AAMAS)},
  year = 2019,
}

Last Updated: February 24, 2020
Submitted by Sabina P
