Optimistic Active Learning using Mutual Information
Full Text:
active.pdf
An ``active learning system'' will sequentially decide which unlabeled
instance to label, with the goal of efficiently gathering the information
necessary to produce a good classifier. Some such systems greedily select the
next instance based only on properties of that instance and the few currently
labeled points, e.g., selecting the one closest to the current classification
boundary. Unfortunately, these approaches ignore the valuable information
contained in the other unlabeled instances, which can help identify a good
classifier much faster. Previous approaches that do exploit this
unlabeled data mostly use it in a conservative way. A
common property of the approaches in the literature is that the active learner
sticks to a single query selection criterion throughout the process. We
propose a system, Mm+M, that selects the query instance that is able to
provide the maximum conditional mutual information about the labels of the
unlabeled instances, given the labeled data, in an optimistic way. This
approach implicitly exploits the discriminative partition information
contained in the unlabeled data. Instead of using one selection criterion,
Mm+M also employs a simple on-line method that changes its selection rule when
it encounters an ``unexpected label''. Our empirical results demonstrate that
this new approach works effectively.
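To make the selection rule concrete, here is a minimal sketch (not the authors' code) of an optimistic mutual-information query rule of the kind the abstract describes: for each candidate query instance and each possible label, retrain a classifier on the labeled set plus that optimistically labeled instance, measure the total predictive entropy over the remaining unlabeled pool, and take the most optimistic (lowest-entropy) label; the instance with the smallest optimistic entropy is queried. The logistic-regression model and toy data are illustrative assumptions, not the paper's setup.

```python
# Illustrative sketch of an optimistic conditional-mutual-information
# query rule (hypothetical helper names; toy model and data).
import numpy as np
from sklearn.linear_model import LogisticRegression

def total_entropy(clf, X):
    """Sum of predictive label entropies over the pool X."""
    p = clf.predict_proba(X)
    return float(-np.sum(p * np.log(p + 1e-12)))

def optimistic_query(X_lab, y_lab, X_pool, labels=(0, 1)):
    """Return the pool index an optimistic MI-based learner would query."""
    best_idx, best_score = None, np.inf
    for i in range(len(X_pool)):
        rest = np.delete(np.arange(len(X_pool)), i)
        # Optimism: assume the candidate label that leaves the pool
        # most certain (minimum total entropy).
        score = min(
            total_entropy(
                LogisticRegression().fit(
                    np.vstack([X_lab, X_pool[i:i + 1]]),
                    np.append(y_lab, y),
                ),
                X_pool[rest],
            )
            for y in labels
        )
        if score < best_score:
            best_idx, best_score = i, score
    return best_idx

# Toy usage: two labeled seed points, a small unlabeled pool.
rng = np.random.default_rng(0)
X_lab = np.array([[-2.0, 0.0], [2.0, 0.0]])
y_lab = np.array([0, 1])
X_pool = rng.normal(size=(8, 2))
print(optimistic_query(X_lab, y_lab, X_pool))
```

Because every candidate instance-label pair requires a retrain, this brute-force form costs one model fit per (instance, label) pair per round; it is meant only to show the shape of the criterion, not an efficient implementation.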
Citation
Y. Guo,
R. Greiner.
"Optimistic Active Learning using Mutual Information".
International Joint Conference on Artificial Intelligence (IJCAI), Hyderabad, India, January 2007.
Keywords: active learning, machine learning, mutual information
Category: In Conference
BibTeX
@inproceedings{Guo+Greiner:IJCAI07,
author = {Yuhong Guo and Russ Greiner},
title = {Optimistic Active Learning using Mutual Information},
booktitle = {International Joint Conference on Artificial Intelligence
(IJCAI)},
year = 2007,
}
Last Updated: March 31, 2007
Submitted by Russ Greiner