Effective Classification Learning
This thesis addresses the problem of learning a classification rule from random examples. We first consider the
problem of learning a target concept with guaranteed accuracy and reliability given that the target belongs to
some known class C; a task commonly referred to as probably approximately correct (pac) learning. Previous
work on this problem assumes a fixedsamplesize approach to data collection that fails to achieve practical
dataefficiency in most applications. In this thesis we consider an alternative ``sequential'' approach where
the learner observes training examples oneatatime and decides online when to stop training. We prove
that sequential learning strategies can paclearn with fewer training examples than previous fixedsamplesize
approaches, even while incurring minimal computational overhead. Moreover, these new strategies use many
times fewer training examples in practical case studies.
Next, we study the average error of a learner's hypotheses as a function of training sample size---its so
called learning curve. Specifically, we investigate the best learning curve that can be achieved in the worst
case over a class of concepts C. Previous work has shown that rational convergence to zero error can always
be obtained in this model, but it is impossible to do better in the worst case. However, recent empirical
studies have shown that exponential convergence can be achieved in many experimental settings. We explain
this discrepancy by noting that the previous analysis is nonuniform in training sample size, and prove that
a uniform analysis predicts the exact same dichotomy as observed in the experimental studies. Overall, this
thesis shows how the worst case theory of classification learning can be brought closer to practice.
Citation
D. Schuurmans.
"Effective Classification Learning".
Value of Information in Inference, Learning and Decision-Making, January 1996.
Keywords: |
Effective, learning, machine learning |
Category: |
|
BibTeX
@incollection{Schuurmans:VOI-NIPS96,
author = {Dale Schuurmans},
title = {Effective Classification Learning},
booktitle = {Value of Information in Inference, Learning and Decision-Making},
year = 1996,
}
Last Updated: April 24, 2007
Submitted by William Thorne