Learning to Classify Incomplete Examples

Full Text: missing.ps (PS)

An inductive inference system uses labeled training examples to learn a classification function for labeling subsequent unlabeled performance examples. Most theoretical analyses assume that both training and performance examples are complete, in that the value of every attribute is known to both learner and classifier. Real-world data, however, is usually incomplete. This paper addresses this discrepancy by formally analyzing the task of learning to classify incompletely specified performance examples, based on either completely or incompletely specified training examples. This formalism requires an extension of the classical notion of concept definition, called a "default concept definition" (dcd), whose classification behavior can be nonmonotonic. We first present a formal account of these dcds and show that they are similar, but not identical, to important existing ideas from both learnability and AI knowledge representation formalisms. We next define a generalization of Valiant's probabilistic model of instance presentation that allows attribute values to be hidden from the classifier. Finally, we use this model to develop an extension to the PAC learning framework that can learn dcds from examples, and prove a number of learnability results, both positive and negative, which collectively suggest the appropriate ways of treating missing attribute values.
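To make the nonmonotonicity concrete, here is a minimal illustrative sketch, not the paper's actual formalism: a toy classifier over partially specified examples, where `None` marks an attribute value hidden from the classifier. The attribute names, rule, and default label are all hypothetical; the point is only that revealing more information can change the assigned label.

```python
# Toy sketch of a "default concept definition" style classifier.
# An example is a dict mapping attribute name -> True / False / None,
# where None means the value is hidden from the classifier.
# Attribute names, the rule, and the default label are hypothetical.

def classify(example):
    """Label positive iff x1 and x2 are both known to be True;
    if either relevant attribute is hidden, apply a default label."""
    x1, x2 = example.get("x1"), example.get("x2")
    if x1 is None or x2 is None:
        return "default-positive"  # default fires under missing values
    return "positive" if (x1 and x2) else "negative"

# Nonmonotonic behavior: learning more about an example can flip its label.
print(classify({"x1": True, "x2": None}))   # default-positive
print(classify({"x1": True, "x2": False}))  # negative
```

Here the partially specified example `{x1: True, x2: None}` receives the default label, but filling in `x2 = False` retracts it, which is exactly the kind of behavior a classical (monotonic) concept definition cannot exhibit.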

Citation

R. Greiner, D. Schuurmans. "Learning to Classify Incomplete Examples". Conference on Learning Theory (COLT), August 1996.

Keywords: missing information, PAC, machine learning
Category: In Conference

BibTeX

@inproceedings{Greiner+Schuurmans:COLT96,
  author = {Russ Greiner and Dale Schuurmans},
  title = {Learning to Classify Incomplete Examples},
  booktitle = {Conference on Learning Theory (COLT)},
  year = 1996,
}

Last Updated: April 25, 2007
Submitted by Russ Greiner
