Publications by Sutton, Richard S.

In Journal (refereed)

1.	J. Travnik, K. Mathewson, R. Sutton, P. Pilarski. "Reactive reinforcement learning in asynchronous environments". Frontiers in Robotics and AI, 5, pp n/a, June 2018.

2.	H. Yu, A. Mahmood, R. Sutton. "On Generalized Bellman Equations and Temporal-Difference Learning". Journal of Machine Learning Research (JMLR), 19(48), pp 1-49, January 2018.

3.	A. Edwards, M. Dawson, J. Hebert, C. Sherstan, R. Sutton, K. Chan, P. Pilarski. "Application of Real-time Machine Learning to Myoelectric Prosthesis Control: A Case Series in Adaptive Switching". Prosthetics and Orthotics International, 40(5), pp 573–581, October 2016.

4.	R. Sutton, A. Mahmood, M. White. "An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning". Journal of Machine Learning Research (JMLR), (ed: Shie Mannor), 17(73), pp 1-29, January 2016.

5.	H. Seijen, A. Mahmood, P. Pilarski, M. Machado, R. Sutton. "True Online Temporal-Difference Learning". Journal of Machine Learning Research (JMLR), 17(145), pp n/a, January 2016.

6.	J. Modayil, A. White, R. Sutton. "Multi-timescale Nexting in a Reinforcement Learning Robot". Adaptive Behavior, 22(2), pp 146-160, April 2014.

7.	P. Pilarski, M. Dawson, T. Degris, J. Carey, K. Chan, J. Hebert, R. Sutton. "Adaptive Artificial Limbs: A Real-time Approach to Prediction and Anticipation". IEEE Robotics and Automation Magazine, 20(1), pp 53-64, March 2013.

8.	P. Stone, R. Sutton, G. Kuhlmann. "Reinforcement Learning for RoboCup-Soccer Keepaway". Adaptive Behavior, 13(3), pp 165-188, March 2005.

9.	R. Sutton, D. Precup, S. Singh. "Between MDPs and Semi-MDPs: A Framework for Temporal Abstractions in Reinforcement Learning". Artificial Intelligence (AIJ), 112, pp 181-211, January 1999.

10.	J. Santamaria, R. Sutton, A. Ram. "Experiments With Reinforcement Learning in Problems With Continuous State and Action Spaces". Adaptive Behavior, 2, pp 163-218, January 1998.

11.	R. Sutton. "On the Significance of Markov Decision Processes". Artificial Neural Networks - ICANN'97, (ed: W. Gerstner, A Germond, M. Hasler, J-D Nicoud), pp 273-282, January 1997.

12.	S. Singh, R. Sutton. "Reinforcement Learning With Replacing Eligibility Traces". Machine Learning Journal (MLJ), (22), pp 123-158, January 1996.

13.	R. Sutton. "Introduction: The Challenge of Reinforcement Learning". Machine Learning Journal (MLJ), (ed: R.Sutton), 8(3-4), pp 225-227, January 1992.

14.	R. Sutton. "Machines That Learn and Mimic the Brain". ACCESS, GTE's Journal of Science and Technology, January 1992.

15.	R. Sutton. "First Results With Dyna: An Integrated Architecture for Learning, Planning, and Reacting". Neural Networks for Control, (ed: Miller T, Sutton R. S., Werbos P.), pp 179-189, January 1990.

16.	A. Barto, R. Sutton, C. Watkins. "Learning and Sequential Decision Making". Learning and Computational Neuroscience, (ed: M. Gabriel, J.W. Moore), pp 539-602, January 1990.

17.	R. Sutton, A. Barto. "Time-Derivative Models of Pavlovian Reinforcement". Learning and Computational Neuroscience, (ed: M. Gabriel, J.W. Moore), pp 497-537, January 1990.

18.	R. Sutton. "Learning to Predict by the Methods of Temporal Differences". Machine Learning Journal (MLJ), 3(1), pp 9-44, January 1988.

19.	O. Selfridge, R. Sutton, C. Anderson. "Selected Bibliography on Connectionism". Evolution, Learning and Cognition, (ed: Y.C. Lee), pp 391-403, January 1988.

20.	J. Moore, J. Desmond, N. Berthier, D. Blazis, R. Sutton, A. Barto. "Simulation of the classically conditioned nictitating membrane response by a neuron-like adaptive element: Response topography, neuronal firing, and interstimulus intervals". Behavioural Brain Research, (21), pp 143-154, May 1986.

21.	A. Barto, R. Sutton. "Neural problem solving". Synaptic Modification, Neuron Selectivity, and Nervous System Organization, (ed: W.B. Levy, J.A. Anderson), pp 123-152, May 1985.

22.	A. Barto, R. Sutton, C. Anderson. "Neuron-like adaptive elements that can solve difficult learning control problems". IEEE Transactions on Systems, Man, and Cybernetics, SMC-13(5), pp 834-846, January 1985.

23.	A. Barto, C. Anderson, R. Sutton. "Simulation of anticipatory responses in classical conditioning by a neuron-like adaptive element". Behavioural Brain Research, pp 221-235, January 1985.

In Conference (refereed)

24.	Y. Wan, M. Zaheer, R. Sutton, A. White, M. White. "Planning with Expectation Models". International Joint Conference on Artificial Intelligence (IJCAI), (ed: Sarit Kraus), pp 3649-3655, August 2019.

25.	B. Rafiee, S. Ghiassian, A. White, R. Sutton. "Prediction in Intelligence: An Empirical Comparison of Off-policy Algorithms on Robots". Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), (ed: Edith Elkind, Manuela Veloso, Noa Agmon, Matthew E. Taylor), pp 332-340, May 2019.

26.	A. Kearney, A. Koop, C. Sherstan, J. Günther, R. Sutton, P. Pilarski, M. Taylor. "Evaluating Predictive Knowledge". AAAI Fall Symposium, pp 43-46, October 2018.

27.	C. Sherstan, B. Bennett, K. Young, D. Ashley, A. White, M. White, R. Sutton. "Comparing Direct and Indirect Temporal-Difference Methods for Estimating the Variance of the Return". Conference on Uncertainty in Artificial Intelligence (UAI), (ed: Amir Globerson and Ricardo Silva), pp 63-72, August 2018.

28.	H. Seijen, A. Mahmood, P. Pilarski, R. Sutton. "An empirical evaluation of True Online TD(lambda)". European Workshop on Reinforcement Learning (EWRL), July 2015.

29.	K. Chan, M. Dawson, A. Edwards, J. Hebert, R. Sutton, P. Pilarski. "Adaptive Switching in Practice: Improving Myoelectric Prosthesis Performance through Reinforcement Learning". Myoelectric Control Symposium, pp 66-70, August 2014.

30.	A. Edwards, A. Kearney, M. Dawson, R. Sutton, P. Pilarski. "Temporal-Difference Learning to Assist Human Decision Making during the Control of an Artificial Limb". Multidisciplinary Conference on Reinforcement Learning and Decision Making, September 2013.

31.	A. White, J. Modayil, R. Sutton. "Scaling Life-long Off-policy Learning". Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (ICDL-Epi, Osaka, Japan, August 2013.

32.	P. Pilarski, T. Dick, R. Sutton. "Real-time prediction learning for the simultaneous actuation of multiple prosthetic joints". International Conference on Rehabilitation Robotics (ICORR), June 2013.

33.	D. Silver, R. Sutton, M. Müller. "Temporal-Difference Search in Computer Go". ICAPS, (ed: Daniel Borrajo, Subbarao Kambhampati, Angelo Oddi, Simone Fratini), pp 486-487, June 2013.

34.	J. Modayil, A. White, P. Pilarski, R. Sutton. "Acquiring a broad range of empirical knowledge in real time by temporal-difference learning". International Conference on Systems, Man, and Cybernetics (SMC), Seoul, South Korea, pp 1903-1910, October 2012.

35.	J. Modayil, A. White, R. Sutton. "Multi-timescale Nexting in a Reinforcement Learning Robot". International Conference on Simulation of Adaptive Behavior (SAB), Odense, Denmark, (ed: Ziemke T., Balkenius C., Hallam J.), pp 299-309, August 2012.

36.	T. Degris, M. White, R. Sutton. "Linear Off-Policy Actor-Critic". International Conference on Machine Learning (ICML), pp n/a, June 2012.

37.	S. Bhatnagar, R. Sutton, M. Ghavamzadeh, M. Lee. "Incremental Natural Actor-Critic Algorithms". Neural Information Processing Systems (NIPS), December 2007.

38.	D. Silver, R. Sutton, M. Mueller. "Reinforcement Learning of Local Shape in the Game of Go". International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, August 2007.

39.	R. Sutton, A. Koop, D. Silver. "On the Role of Tracking in Stationary Environments". International Conference on Machine Learning (ICML), April 2007.

40.	A. Geramifard, M. Bowling, M. Zinkevich, R. Sutton. "iLSTD: Eligibility Traces and Convergence Analysis". Neural Information Processing Systems (NIPS), pp To appear (8 pages), March 2007.

41.	A. Geramifard, M. Bowling, R. Sutton. "Incremental least-squares temporal difference learning". National Conference on Artificial Intelligence (AAAI), Boston, Massachusetts, USA, pp 356-361, January 2006.

42.	B. Tanner, R. Sutton. "TD(lambda) Networks: Temporal-Difference Networks With Eligibility Traces". International Conference on Machine Learning (ICML), Bonn, Germany, August 2005.

43.	B. Tanner, R. Sutton. "Temporal-Difference Networks With History". International Joint Conference on Artificial Intelligence (IJCAI), Edinburgh, Scotland, August 2005.

44.	E. Rafols, M. Ring, R. Sutton, B. Tanner. "Using Predictive Representations to Improve Generalization in Reinforcement Learning". International Joint Conference on Artificial Intelligence (IJCAI), Edinburgh, Scotland, August 2005.

45.	D. Precup, R. Sutton, C. Paduraru, A. Koop, S. Singh. "Off-Policy Learning With Recognizers". Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, January 2005.

46.	R. Sutton, E. Rafols, A. Koop. "Temporal Abstraction in Temporal-Difference Networks". Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, January 2005.

47.	R. Sutton, B. Tanner. "Temporal-Difference Networks". Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, (ed: MIT Press), January 2005.

48.	D. Precup, R. Sutton, S. Dasgupta. "Off-Policy Temporal-Difference Learning With Function Approximation". International Conference on Machine Learning (ICML), Williams College, pp 417-424, January 2001.

49.	M. Littman, R. Sutton, S. Singh. "Predictive Representations of State". Neural Information Processing Systems (NIPS), Vancouver, British Columbia, Canada, January 2001.

50.	P. Stone, R. Sutton. "Scaling Reinforcement Learning Toward RoboCup Soccer". International Conference on Machine Learning (ICML), Williams College, January 2001.

51.	D. Precup, R. Sutton, S. Singh. "Eligibility Traces for Off-Policy Policy Evaluation". International Conference on Machine Learning (ICML), Stanford University, pp 759-766, January 2000.

52.	R. Sutton, S. Singh, D. Precup, B. Ravindran. "Improved Switching Among Temporally Abstract Actions". Neural Information Processing Systems (NIPS), Denver, CO, USA, pp 1066-1072, January 1999.

53.	R. Moll, A. Barto, T. Perkins, R. Sutton. "Learning Instance-Independent Value Functions to Enhance Local Search". Neural Information Processing Systems (NIPS), Denver, CO, USA, pp 1017-1023, January 1999.

54.	R. Sutton. "Open Theoretical Questions in Reinforcement Learning". Conference on Learning Theory (COLT), pp 11-17, January 1999.

55.	R. Sutton, D. McAllester, S. Singh, Y. Mansour. "Policy Gradient Methods for Reinforcement Learning With Function Approximation". Neural Information Processing Systems (NIPS), Denver, CO, USA, pp 1057-1063, January 1999.

56.	R. Sutton, D. Precup, S. Singh. "Intra-Option Learning About Temporally Abstract Actions". International Conference on Machine Learning (ICML), Madison, Wisconsin USA, pp 556-564, January 1998.

57.	D. Precup, R. Sutton. "Multi-Time Models for Temporally Abstract Planning". Neural Information Processing Systems (NIPS), Denver, CO, USA, pp 1050-1056, January 1998.

58.	D. Precup, R. Sutton, S. Singh. "Theoretical Results on Reinforcement Learning With Temporally Abstract Options". European Conference on Machine Learning (ECML), Chemnitz, Germany, pp 382-393, January 1998.

59.	A. McGovern, R. Sutton, A. Fagg. "Roles of Macro-Actions in Accelerating Reinforcement Learning". Grace Hopper Celebration of Women in Computing, pp 13-17, September 1997.

60.	D. Precup, R. Sutton. "Exponentiated Gradient Methods for Reinforcement Learning". International Conference on Machine Learning (ICML), Nashville, pp 272-277, July 1997.

61.	D. Precup, R. Sutton. "Multi-Time Models for Reinforcement Learning". International Conference on Machine Learning (ICML), Nashville, July 1997.

62.	D. Precup, R. Sutton, S. Singh. "Planning with Closed-Loop Macro Actions". National Conference on Artificial Intelligence (AAAI), Providence, Rhode Island, pp 73-76, May 1997.

63.	R. Sutton. "Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding". Neural Information Processing Systems (NIPS), pp 1038-1044, January 1996.

64.	R. Sutton. "TD Models: Modeling the World at a Mixture of Time Scales". International Conference on Machine Learning (ICML), pp 531-539, January 1995.

65.	R. Sutton, S. Whitehead. "Online Learning With Random Representations". International Conference on Machine Learning (ICML), Amherst, MA, USA, (ed: M. Kaufmann), pp 314-321, January 1993.

66.	T. Sanger, R. Sutton, C. Matheus. "Iterative Construction of Sparse Polynomial Approximations". Neural Information Processing Systems (NIPS), Denver, CO, USA, December 1992.

67.	M. Gluck, P. Glauthier, R. Sutton. "Adaptation of Cue-Specific Learning Rates in Network Models of Human Category Learning". Conference of the Cognitive Science Society (CogSci), pp 540-545, July 1992.

68.	A. Barto, R. Sutton, C. Watkins. "Sequential decision problems and neural networks". Neural Information Processing Systems (NIPS), Denver, CO, USA, May 1992.

69.	R. Sutton. "Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta". National Conference on Artificial Intelligence (AAAI), January 1992.

70.	R. Sutton. "Reinforcement Learning Architectures". ISKIT, pp 211-216, January 1992.

71.	R. Sutton. "Dyna, an Integrated Architecture for Learning, Planning and Reacting". National Conference on Artificial Intelligence (AAAI), pp 160-163, January 1991.

72.	R. Sutton. "Reinforcement Learning Architectures for Animats". Conference on Simulation of Adaptive Behavior (CSAB), January 1991.

73.	R. Sutton, A. Barto, R. Williams. "Reinforcement Learning is Direct Adaptive Optimal Control". American Control Conference (ACC), January 1991.

74.	S. Whitehead, R. Sutton, D. Ballard. "Advances in Reinforcement Learning and Their Implications for Intelligent Control". IEEE, pp 1289-1297, January 1990.

75.	R. Sutton. "Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming". International Conference on Machine Learning (ICML), Austin, Texas, USA, pp 216-224, January 1990.

76.	C. Anderson, J. Franklin, R. Sutton. "Learning a Nonlinear Model of a Manufacturing Process Using Multilayer Connectionist Networks". IEEE, pp 404-409, January 1990.

77.	J. Franklin, R. Sutton, C. Anderson. "Application of Connectionist Learning Methods to Manufacturing Process Monitoring". IEEE, pp 709-712, January 1989.

78.	R. Sutton. "Artificial Intelligence as a Control Problem: Comments on the Relationship Between Machine Learning and Intelligent Control". IEEE, January 1989.

79.	R. Sutton, A. Barto. "A Temporal-Difference Model of Classical Conditioning". Conference of the Cognitive Science Society (CogSci), pp 355-378, January 1987.

80.	R. Sutton. "Two problems with backpropagation and other steepest-descent learning procedures for networks". Conference of the Cognitive Science Society (CogSci), pp 823-831, May 1986.

81.	J. Moore, J. Desmond, N. Berthier, D. Blazis, R. Sutton, A. Barto. "Connectionist learning in real time: Sutton-Barto adaptive element and classical conditioning of the nictitating membrane response". Conference of the Cognitive Science Society (CogSci), pp 318-322, May 1985.

82.	R. Sutton, B. Pinette. "The learning of world models by connectionist networks". Conference of the Cognitive Science Society (CogSci), pp 54-64, May 1985.

83.	O. Selfridge, R. Sutton, A. Barto. "Training and tracking in robotics". International Joint Conference on Artificial Intelligence (IJCAI), pp 670-672, May 1985.

In Workshop

84.	V. Veeriah, P. Pilarski, R. Sutton. "Face valuing: Training user interfaces with facial expressions and reinforcement learning". Workshop on Interactive Machine Learning, July 2016.

85.	P. Pilarski, R. Sutton, K. Mathewson. "Prosthetic Devices as Goal-Seeking Agents". Present and Future of Non-invasive Peripheral-Nervous-System Machine Interfaces: Progress in Restori, August 2015.

86.	R. Sutton. "Learning distributed, searchable, internal models". Distributed Artificial Intelligence Workshop, May 2007.

87.	P. Stone, R. Sutton, S. Singh. "Reinforcement Learning for 3 vs. 2 Keepaway". RoboCup, January 2001.

88.	A. McGovern, D. Precup, B. Ravindran, S. Singh, R. Sutton. "Hierarchical Optimal Control of MDPs". Yale Workshop on Adaptive and Learning Systems, pp 186-191, January 1998.

89.	R. Mehra, B. Ravichandran, R. Sutton. "Adaptive Intelligent Scheduling for ATM Networks". Yale Workshop on Adaptive and Learning Systems, pp 106-111, January 1996.

90.	L. Kuvayev, R. Sutton. "Model-Based Reinforcement Learning With An Approximate, Learned Model". Yale Workshop on Adaptive and Learning Systems, pp 101-105, January 1996.

91.	R. Sutton, S. Singh. "On Bias and Step Size in Temporal-Difference Learning". Yale Workshop on Adaptive and Learning Systems, pp 91-96, January 1994.

92.	A. Barto, R. Sutton. "Gain Adaptation Beats Least Squares?". Yale Workshop on Adaptive and Learning Systems, pp 161-166, January 1992.

93.	R. Sutton, C. Matheus. "Learning Polynomial Functions by Feature Construction". IWML, January 1991.

94.	R. Sutton. "Planning by Incremental Dynamic Programming". International Workshop on Machine Learning, pp 353-357, January 1991.

95.	R. Sutton. "Artificial intelligence by dynamic programming". Yale Workshop on Adaptive and Learning Systems, pp 89-95, May 1990.

96.	R. Sutton. "Convergence Theory for a New Kind of Prediction Learning". WCLT, pp 421-422, January 1988.

Other Categories

97.	E. Ludvig, R. Sutton, E. Verbeek, J. Neufeld, E. Kehoe. "Stimulus representation and response timing in a temporal-difference (TD) model of classical conditioning". Pavlovian Society, October 2007.

98.	M. Littman, R. Sutton, S. Singh. "Predictive Representations of State". Predictive Representations of World Knowledge, January 2002.

99.	R. Sutton. "Reinforcement learning". MIT Encyclopedia of the Cognitive Sciences, MIT Press, (ed: R. Wilson F. Keil), pp 715-717, May 1999.

100.	A. McGovern, R. Sutton. "Macro-Actions in Reinforcement Learning: An Empirical Analysis". Technical Report, January 1998.

101.	R. Sutton, A. Barto. "Reinforcement Learning: An Introduction". MIT Press, January 1998.

102.	D. Precup, R. Sutton, S. Singh. "Notes". National Conference on Artificial Intelligence (AAAI), Providence, Rhode Island, January 1997.

103.	D. Precup, R. Sutton. "Empirical Comparison of Gradient Descent and Exponentiated Gradient Descent in Supervised and Reinforcement Learning". Technical Report, January 1996.

104.	R. Sutton. "Reinforcement Learning". Reinforcement Learning, Reprinting of a special issue of Machine Learning Journal, Kluwer Academic Press, (ed: Sutton R. S.), May 1992.

105.	R. Sutton. "Integrated Modeling and Control Based on Reinforcement Learning and Dynamic Programming". January 1991.

106.	W. Miller, R. Sutton, P. Werbos. "Neural Networks for Control". MIT Press, (ed: W. Miller, R. Sutton, P. Werbos.), January 1991.

107.	J. Franklin, R. Sutton, C. Anderson, O. Selfridge, D. Schwartz. "Connectionist Learning Control at GTE Laboratories". pp 242-253, February 1990.

108.	R. Sutton. "Implementation Details of the TD(lambda) Procedure for the Case of Vector Predictions and Backpropagation". Technical Report, January 1989.

109.	R. Sutton. "NADALINE: A Normalized Adaptive Linear Element That Learns Efficiently". Technical Report, January 1988.
