Reinforcement Learning Architectures for Animats

In the first part of this paper I argue that the learning problem facing animats is essentially that which has been studied as the reinforcement learning problem---the learning of behavior by trial and error without an explicit teacher. A brief overview is presented of the development of reinforcement learning architectures over the past decade, with references to the literature. The second part of the paper presents Dyna, a class of architectures based on reinforcement learning but which go beyond trial-and-error learning. Dyna architectures include a learned internal model of the world. By intermixing conventional trial and error with hypothetical trial and error using the world model, Dyna systems can plan and learn optimal behavior very rapidly. Results are shown for simple Dyna systems that learn from trial and error while they simultaneously learn a world model and use it to plan optimal action sequences. We also show that Dyna architectures are easy to adapt for use in changing environments.
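The intermixing of real and hypothetical experience described above can be sketched as a Dyna-Q style loop. This is a minimal illustration, not the paper's code: the five-state corridor environment, the parameter values, and all function names are assumptions made for the sake of a runnable example.

```python
import random
from collections import defaultdict

# Hypothetical 5-state corridor: states 0..4, actions -1 (left), +1 (right).
# Reaching state 4 yields reward 1 and ends the episode.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)

def step(state, action):
    nxt = min(max(state + action, 0), GOAL)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

def dyna_q(episodes=30, n_planning=10, alpha=0.5, gamma=0.95,
           epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)   # action values Q[(state, action)]
    model = {}               # learned world model: (s, a) -> (r, s', done)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s2, r, done = step(s, a)
            # direct RL: one-step Q-learning on the real transition
            target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in ACTIONS))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            # learn the world model from the same experience
            model[(s, a)] = (r, s2, done)
            # planning: n hypothetical transitions replayed from the model
            for _ in range(n_planning):
                (ps, pa), (pr, ps2, pdone) = rng.choice(list(model.items()))
                ptarget = pr + (0.0 if pdone else
                                gamma * max(Q[(ps2, b)] for b in ACTIONS))
                Q[(ps, pa)] += alpha * (ptarget - Q[(ps, pa)])
            s = s2
    return Q

Q = dyna_q()
```

Because the planning steps reuse each real transition many times through the model, the learned values propagate toward the start state far faster than with trial and error alone, which is the speed-up the abstract refers to.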

Citation

R. Sutton. "Reinforcement Learning Architectures for Animats". Conference on Simulation of Adaptive Behavior (CSAB), January 1991.

Keywords: Dyna, explicit, architectures, machine learning
Category: In Conference

BibTeX

@incollection{Sutton:CSAB91,
  author = {Richard S. Sutton},
  title = {Reinforcement Learning Architectures for Animats},
  booktitle = {Conference on Simulation of Adaptive Behavior (CSAB)},
  year = 1991,
}

Last Updated: January 04, 2007
Submitted by William Thorne
