
Exploiting the Omission of Irrelevant Data

Full Text: greiner96exploiting.pdf

Most learning algorithms work most effectively when their training data contain completely specified labeled samples. In many diagnostic tasks, however, the data will include the values of only some of the attributes; we model this as a blocking process that hides the values of those attributes from the learner. While blockers that remove the values of critical attributes can handicap a learner, this paper instead focuses on blockers that remove only irrelevant attribute values, i.e., values that are not needed to classify an instance given the values of the other unblocked attributes. We first motivate and formalize this model of "superfluous-value blocking", and then demonstrate that these omissions can be useful by proving that certain classes that seem hard to learn in the general PAC model, namely decision trees and DNF formulae, are trivial to learn in this setting. We also show that this model can be extended to deal with (1) theory revision (i.e., modifying an existing formula); (2) blockers that occasionally include superfluous values or exclude required values; and (3) other corruptions of the training data.
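To see why such omissions can make DNF easy to learn, consider a minimal sketch (not the authors' code, and all names hypothetical): assume the blocker reveals, for each positive example, only the values of the attributes in one term the example satisfies. The learner can then read the target's terms directly off the unblocked views.

    # Sketch, assuming a superfluous-value blocker for a DNF target:
    # each positive example reveals only one satisfied term's attributes.
    # TARGET_DNF, blocker, and learn are illustrative, not from the paper.

    TARGET_DNF = [{"x1": 1, "x2": 0}, {"x3": 1}]  # (x1 AND NOT x2) OR x3

    def blocker(instance):
        """For a positive instance, unblock only the attributes of one
        satisfied term; a negative instance yields no satisfied term."""
        for term in TARGET_DNF:
            if all(instance.get(a) == v for a, v in term.items()):
                return {a: instance[a] for a in term}  # partial view
        return None  # negative instance

    def learn(blocked_samples):
        """Collect each distinct revealed partial assignment as a term."""
        terms = []
        for view in blocked_samples:
            if view is not None and view not in terms:
                terms.append(view)
        return terms

    samples = [
        {"x1": 1, "x2": 0, "x3": 0},  # satisfies the first term
        {"x1": 0, "x2": 1, "x3": 1},  # satisfies the second term
    ]
    print(learn(blocker(s) for s in samples))  # recovers both terms

Each positive example hands the learner one term verbatim, so no search over hypotheses is needed; this is the sense in which the class is trivial under this blocking model.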

Citation

R. Greiner, A. Grove, A. Kogan. "Exploiting the Omission of Irrelevant Data". International Conference on Machine Learning (ICML), pp. 207-215, July 1996.

Keywords: irrelevant, machine learning, DNF, theory revision
Category: In Conference

BibTeX

@inproceedings{Greiner+al:ICML96,
  author    = {Russ Greiner and Adam Grove and Alexander Kogan},
  title     = {Exploiting the Omission of Irrelevant Data},
  booktitle = {International Conference on Machine Learning (ICML)},
  pages     = {207--215},
  year      = {1996}
}

