Alternative Function Approximation Parameterizations for Solving Games: An Analysis of f-Regression Counterfactual Regret Minimization

Function approximation is a powerful approach for structuring large decision problems that has facilitated great achievements in the areas of reinforcement learning and game playing. Regression counterfactual regret minimization (RCFR) is a simple and flexible algorithm for approximately solving imperfect information games with policies parameterized by a normalized rectified linear unit (ReLU). In contrast, the softmax parameterization is standard in the field of reinforcement learning and has a regret bound with a better dependence on the number of actions in the tabular case. We derive approximation error-aware regret bounds for f-regret matching, which applies to a general class of link functions and regret objectives. These bounds recover a tighter bound for RCFR and provide a theoretical justification for RCFR implementations with alternative policy parameterizations (f-RCFR), including softmax. We provide exploitability bounds for f-RCFR with the polynomial and exponential link functions in zero-sum imperfect information games, and examine empirically how the link function interacts with the severity of the approximation to determine exploitability performance in practice. Although a ReLU-parameterized policy is typically the best choice, a softmax parameterization can perform as well or better in settings that require aggressive approximation.
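As a rough illustration of the two parameterizations the abstract contrasts, the sketch below turns a vector of estimated regrets into a policy by applying a link function and normalizing. It is a minimal sketch, not the authors' implementation: the function names, the exponent parameter `p`, the `temperature` parameter, and the uniform fallback when every weight is zero are assumptions made for this example.

```python
import math


def polynomial_link(regrets, p=2.0):
    """Polynomial link: positive part raised to the power p - 1.

    With p = 2 (assumed default) this is a plain ReLU, so normalizing
    the result gives the normalized-ReLU policy.
    """
    return [max(r, 0.0) ** (p - 1.0) for r in regrets]


def exponential_link(regrets, temperature=1.0):
    """Exponential link: normalizing the result gives a softmax policy.

    Subtracting the maximum only rescales every weight by the same
    factor, which cancels after normalization and avoids overflow.
    """
    shift = max(regrets)
    return [math.exp((r - shift) / temperature) for r in regrets]


def link_policy(regrets, link):
    """Normalize link(regrets) into a probability distribution.

    Falls back to the uniform policy when all weights are zero
    (an assumption of this sketch, e.g. no positive regret under ReLU).
    """
    weights = link(regrets)
    total = sum(weights)
    if total <= 0.0:
        return [1.0 / len(regrets)] * len(regrets)
    return [w / total for w in weights]


if __name__ == "__main__":
    estimated_regrets = [2.0, -1.0, 1.0]  # hypothetical regret estimates
    print(link_policy(estimated_regrets, polynomial_link))
    print(link_policy(estimated_regrets, exponential_link))
```

With these hypothetical estimates, the ReLU policy puts no probability on the action with negative estimated regret, while the softmax policy keeps every action's probability strictly positive.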

Citation

R. D'Orazio, D. Morrill, J. Wright, M. Bowling. "Alternative Function Approximation Parameterizations for Solving Games: An Analysis of f-Regression Counterfactual Regret Minimization". Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), (ed: Amal El Fallah Seghrouchni, Gita Sukthankar, Bo An, Neil Yorke-Smith), pp 339-347, May 2020.

Keywords: Regret minimization, Counterfactual regret minimization, Function approximation, Zero-sum games, Extensive-form games
Category: In Conference
Web Links: ACM Digital Library

BibTeX

@incollection{D'Orazio+al:AAMAS20,
  author = {Ryan D'Orazio and Dustin Morrill and James R. Wright and Michael
    Bowling},
  title = {Alternative Function Approximation Parameterizations for Solving
    Games: An Analysis of f-Regression Counterfactual Regret Minimization},
  Editor = {Amal El Fallah Seghrouchni and Gita Sukthankar and Bo An and Neil
    Yorke-Smith},
  Pages = {339-347},
  booktitle = {Joint Conference on Autonomous Agents and Multi-Agent Systems
    (AAMAS)},
  year = 2020,
}

Last Updated: September 10, 2020
Submitted by Sabina P
