
Characterizing the Generalization Performance of Model Selection Strategies

Dale Schuurmans, AICML
Lyle H. Ungar, Computer and Information Science - University of Pennsylvania
Dean P. Foster, Department of Statistics, University of Pennsylvania

Full Text: schuurmans97characterizing.pdf

We investigate the structure of model selection problems via the bias/variance decomposition. In particular, we characterize the essential aspects of a model selection task by the bias and variance profiles it generates over the sequence of hypothesis classes. With this view, we develop a new understanding of complexity-penalization methods: First, the penalty terms can be interpreted as postulating a particular profile for the variances as a function of model complexity; if the postulated and true profiles do not match, then systematic under-fitting or over-fitting results, depending on whether the penalty terms are too large or too small. Second, we observe that it is generally best to penalize according to the true variances of the task, and therefore no fixed penalization strategy is optimal across all problems. We then use this characterization to introduce the notion of easy versus hard model selection problems. Here we show that if the variance profile grows too rapidly in relation to the biases, then standard model selection techniques become prone to significant errors. This can happen, for example, in regression problems where the independent variables are drawn from wide-tailed distributions. To counter this, we discuss a new model selection strategy that dramatically outperforms standard complexity-penalization and hold-out methods on these hard tasks.
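
The abstract's point about mismatched penalties can be illustrated with a small idealized sketch. It is not taken from the paper. It assumes that expected error decomposes as squared bias plus variance, and that a penalization method in effect picks the hypothesis class minimizing squared bias plus its postulated variance (the penalty). The function name `select_class` and all numeric profiles are hypothetical, chosen only for illustration.

```python
# Idealized illustration only. The profiles are made-up numbers,
# not results from the paper.

def select_class(bias_sq, penalty):
    """Return the index of the class minimizing bias^2 + penalty."""
    scores = [b + p for b, p in zip(bias_sq, penalty)]
    return min(range(len(scores)), key=scores.__getitem__)

# Hypothetical squared-bias profile: shrinks as complexity grows.
bias_sq = [1.00, 0.50, 0.25, 0.12, 0.06, 0.04]
# Hypothetical true variance profile: grows with complexity.
true_var = [0.02, 0.06, 0.12, 0.20, 0.30, 0.42]

def true_error(i):
    """Expected error of class i under the assumed decomposition."""
    return bias_sq[i] + true_var[i]

# Penalize by the true variances, then by profiles that are too large
# and too small by a factor of four.
matched = select_class(bias_sq, true_var)
too_large = select_class(bias_sq, [4 * v for v in true_var])
too_small = select_class(bias_sq, [v / 4 for v in true_var])

for name, idx in [("matched", matched),
                  ("too large", too_large),
                  ("too small", too_small)]:
    print(f"{name:9s} -> class {idx}, true error {true_error(idx):.2f}")
```

Under these assumed profiles, the overly large penalty selects a simpler class than the matched penalty (under-fitting), the overly small penalty selects a more complex one (over-fitting), and the matched penalty attains the lowest true error of the three.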

Citation

D. Schuurmans, L. Ungar, D. Foster. "Characterizing the Generalization Performance of Model Selection Strategies". International Conference on Machine Learning (ICML), Nashville, January 1997.

Keywords:	generalization, selection, machine learning
Category:	In Conference

BibTeX

@inproceedings{Schuurmans+al:ICML97,
  author = {Dale Schuurmans and Lyle H. Ungar and Dean P. Foster},
  title = {Characterizing the Generalization Performance of Model Selection
    Strategies},
  booktitle = {International Conference on Machine Learning (ICML)},
  year = 1997,
}

Last Updated: August 16, 2007
Submitted by Russ Greiner
