Publications by Munos, Remi
In Journal (refereed)
1. A. Antos, C. Szepesvari, R. Munos. "Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path". Machine Learning Journal (MLJ), June 2007.
2. R. Munos, C. Szepesvari. "Finite Time Bounds for Sampling Based Fitted Value Iteration". Journal of Machine Learning Research (JMLR), March 2007.
In Conference (refereed)
3. S. Srinivasan, M. Lanctot, V. Zambaldi, J. Perolat, K. Tuyls, R. Munos, M. Bowling. "Actor-Critic Policy Optimization in Partially Observable Multiagent Environments". Neural Information Processing Systems (NIPS), (ed: Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, Roman Garnett), pp 3426-3439, December 2018.
4. J. Audibert, R. Munos, C. Szepesvari. "Tuning bandit algorithms in stochastic environments". Algorithmic Learning Theory (ALT), October 2007.
5. A. Antos, C. Szepesvari, R. Munos. "Value-Iteration Based Fitted Policy Iteration: Learning with a Single Trajectory". Symposium on Approximate Dynamic Programming and Reinforcement Learning, pp 330-337, April 2007.
6. A. Antos, C. Szepesvari, R. Munos. "Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path". Conference on Learning Theory (COLT), January 2006.
7. C. Szepesvari, R. Munos. "Finite Time Bounds for Sampling Based Fitted Value Iteration". International Conference on Machine Learning (ICML), Bonn, Germany, pp 881-886, January 2005.