The effect of planning shape on Dyna-style planning in high-dimensional state spaces
- Zach Holland
- Erik Talvitie
- Michael Bowling, University of Alberta
Dyna is an architecture for reinforcement learning agents that interleaves planning, acting, and learning in an online setting. Dyna aims to make fuller use of limited experience to achieve better performance with fewer environmental interactions. In Dyna, the environment model is typically used to generate one-step transitions from selected start states. We applied one-step Dyna to several games from the Arcade Learning Environment and found that the model-based updates offered little benefit, even with a perfect model. However, when the model was used to generate longer trajectories of simulated experience, performance improved dramatically. This observation also holds when using a model that is learned from experience; even though the learned model is flawed, it can still be used to accelerate learning.
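To make the "planning shape" distinction concrete, here is a minimal, hypothetical sketch of tabular Dyna-Q with a `rollout_len` parameter: `rollout_len=1` gives classic one-step Dyna, while larger values roll the model forward to generate multi-step trajectories of simulated experience. This is an illustrative toy (tabular Q-values, a deterministic learned model stored as a dictionary), not the function-approximation setup the paper studies in the Arcade Learning Environment.

```python
import random
from collections import defaultdict

def dyna_q(env_step, actions, episodes=50, planning_steps=10, rollout_len=1,
           alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Dyna-Q sketch: interleave real experience with model-based updates.

    rollout_len controls the planning shape: 1 reproduces one-step Dyna;
    larger values roll the (learned) model forward from the sampled start
    state, generating a longer trajectory of simulated experience.
    """
    rng = random.Random(seed)
    Q = defaultdict(float)   # Q[(state, action)] -> value estimate
    model = {}               # learned model: (s, a) -> (reward, next_state, done)
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Acting: epsilon-greedy action in the real environment.
            a = (rng.choice(actions) if rng.random() < eps
                 else max(actions, key=lambda b: Q[(s, b)]))
            r, s2, done = env_step(s, a)
            # Learning: one-step Q-update from real experience.
            target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in actions))
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            model[(s, a)] = (r, s2, done)
            # Planning: sample a start state-action, then follow the model
            # greedily for up to rollout_len simulated steps.
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model))
                for _ in range(rollout_len):
                    pr, ps2, pdone = model[(ps, pa)]
                    t = pr + (0.0 if pdone else
                              gamma * max(Q[(ps2, b)] for b in actions))
                    Q[(ps, pa)] += alpha * (t - Q[(ps, pa)])
                    if pdone:
                        break
                    ps = ps2
                    pa = max(actions, key=lambda b: Q[(ps, b)])
                    if (ps, pa) not in model:
                        break  # model has no prediction for this pair yet
            s = s2
    return Q

# Toy usage: a 5-state chain where action 1 moves right toward a reward.
def chain_step(s, a):
    s2 = max(0, s - 1) if a == 0 else s + 1
    return (1.0, s2, True) if s2 == 4 else (0.0, s2, False)
```

With a longer `rollout_len`, each sampled start state seeds a multi-step simulated trajectory rather than a single transition, which is the variable the paper's experiments manipulate.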
Citation
Z. Holland, E. Talvitie, and M. Bowling. "The effect of planning shape on Dyna-style planning in high-dimensional state spaces". Workshop on Prediction and Generative Modeling in Reinforcement Learning, July 2018.

Category: In Workshop
Web Links: ICML
BibTeX
@inproceedings{Holland+al:18,
  author    = {Zach Holland and Erik Talvitie and Michael Bowling},
  title     = {The effect of planning shape on Dyna-style planning in high-dimensional state spaces},
  booktitle = {Workshop on Prediction and Generative Modeling in Reinforcement Learning},
  year      = {2018},
}

Last Updated: October 29, 2020