Baseline: Practical control variates for agent evaluation in zero-sum domains

Agent evaluation in stochastic domains can be difficult. The commonplace approach of Monte Carlo evaluation can require a prohibitive number of simulations when the variance of the outcome is high. In such domains, variance reduction techniques are necessary, but these techniques require careful encoding of domain knowledge. This paper introduces baseline, a simple approach to creating low-variance estimators for zero-sum multi-agent domains with high outcome variance. The baseline method leverages the self-play of any available agent to produce a control variate for variance reduction, avoiding the extra complexity inherent in traditional approaches. The baseline method is also applicable in situations where existing techniques either require extensive implementation overhead or simply cannot be applied. Experimental variance reduction results using the baseline method are shown for both cases. Baseline is shown to surpass state-of-the-art techniques in three-player computer poker and to be competitive in two-player computer poker games. Baseline also shows variance reduction in human poker and in a mock Ad Auction tournament from the Trading Agent Competition, domains where variance reduction methods are not typically employed.
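
The control variate idea mentioned in the abstract can be illustrated with a minimal toy sketch. This is not the paper's implementation: the function name simulate, the Gaussian payoff model, the 0.5 skill edge, and the fixed coefficient of 1 are all assumptions made for illustration. The sketch assumes a baseline payoff that is driven by the same chance events as the evaluated agent's payoff and has a known mean of zero, so subtracting it leaves the estimate unbiased while removing the shared noise.

```python
import random
import statistics


def simulate(n, seed=0):
    """Return (raw, adjusted) payoff samples from a toy game.

    Hypothetical model, for illustration only: each payoff is a small
    skill edge plus a large luck term shared with the baseline.
    """
    rng = random.Random(seed)
    raw, adjusted = [], []
    for _ in range(n):
        luck = rng.gauss(0.0, 10.0)        # shared chance events, mean zero
        skill = 0.5 + rng.gauss(0.0, 1.0)  # assumed true edge of 0.5 plus own noise
        outcome = skill + luck             # payoff of the agent being evaluated
        baseline = luck                    # assumed baseline payoff on the same chance events
        raw.append(outcome)
        # Control variate with coefficient 1 and known baseline mean 0.
        adjusted.append(outcome - baseline)
    return raw, adjusted


if __name__ == "__main__":
    raw, adjusted = simulate(10_000)
    print("raw mean:", statistics.mean(raw), "variance:", statistics.variance(raw))
    print("adjusted mean:", statistics.mean(adjusted), "variance:", statistics.variance(adjusted))
```

In this toy model both estimators target the same mean, but the adjusted samples have far lower variance because the luck term cancels exactly. In a real domain the cancellation would only be partial, and the achievable reduction depends on how strongly the baseline payoff correlates with the evaluated agent's payoff.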

Citation

J. Davidson, C. Archibald, M. Bowling. "Baseline: Practical control variates for agent evaluation in zero-sum domains". Joint Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), (ed: Maria L. Gini, Onn Shehory, Takayuki Ito, Catholijn M. Jonker), pp 1005-1012, May 2013.

Category: In Conference
Web Links: ACM Digital Library

BibTeX

@incollection{Davidson+al:AAMAS13,
  author = {Joshua Davidson and Christopher Archibald and Michael Bowling},
  title = {Baseline: Practical control variates for agent evaluation in
    zero-sum domains},
  Editor = {Maria L. Gini and Onn Shehory and Takayuki Ito and Catholijn M. Jonker},
  Pages = {1005--1012},
  booktitle = {Joint Conference on Autonomous Agents and Multi-Agent Systems
    (AAMAS)},
  year = 2013,
}

Last Updated: October 29, 2020
Submitted by Sabina P