The Budgeted Multi-Armed Bandit Problem
- Omid Madani, Yahoo! Research
- Dan Lizotte, University of Michigan
- Russ Greiner, Dept of Computing Science; PI of AICML
The following coins problem is a version of a multi-armed bandit problem in which one must select from among a set of objects, say classifiers, after an experimentation phase constrained by a time or cost budget. The question is how to spend the budget. The problem involves pure exploration only, which differentiates it from typical multi-armed bandit problems that involve an exploration/exploitation tradeoff. It is an abstraction of scenarios such as: choosing from among a set of alternative treatments after a fixed number of clinical trials; determining the best parameter settings for a program given a deadline that allows only a fixed number of runs; or choosing a life partner on a bachelor/bachelorette TV show where time is limited. We are interested in the computational complexity of the coins problem and in efficient algorithms with approximation guarantees.
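The coins problem can be simulated directly: each "coin" has an unknown head probability, a flip costs one unit of budget, and after the budget is spent one coin must be chosen. The sketch below uses a simple round-robin allocation as a baseline exploration strategy; it is illustrative only and is not the algorithm studied in the paper, and the function and parameter names are made up for this example.

```python
import random

def budgeted_coin_selection(biases, budget, rng=None):
    """Baseline for the coins problem: spend the flip budget evenly
    across coins (round-robin), then commit to the coin with the
    highest empirical head rate. Illustrative only; not the paper's
    algorithm."""
    rng = rng or random.Random(0)
    n = len(biases)
    heads = [0] * n
    flips = [0] * n
    for t in range(budget):
        i = t % n                      # round-robin budget allocation
        flips[i] += 1
        if rng.random() < biases[i]:   # simulate one flip of coin i
            heads[i] += 1
    means = [h / f if f else 0.0 for h, f in zip(heads, flips)]
    return max(range(n), key=lambda i: means[i])  # index of chosen coin

# Example: three coins with unknown biases, a budget of 300 flips
best = budgeted_coin_selection([0.1, 0.5, 0.9], budget=300)
```

Smarter budget allocations (e.g. spending more flips on coins whose empirical means are close) are exactly the kind of policy whose complexity and approximability the paper investigates.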
Citation
O. Madani, D. Lizotte, R. Greiner. "The Budgeted Multi-Armed Bandit Problem". Conference on Learning Theory (COLT), Banff, Alberta, pp. 643-645, July 2004.
Keywords: budgeted learning, bandit problem, machine learning
Category: In Conference
BibTeX
@incollection{Madani+al:COLT04,
  author    = {Omid Madani and Dan Lizotte and Russ Greiner},
  title     = {The Budgeted Multi-Armed Bandit Problem},
  booktitle = {Conference on Learning Theory (COLT)},
  pages     = {643--645},
  year      = {2004},
}
Last Updated: July 07, 2007
Submitted by Russ Greiner