Model Selection for Semi-Supervised Clustering

Full Text: PourrajabiMCZSG14.pdf PDF

Although there is a large and growing literature that tackles the semi-supervised clustering problem (i.e., using some labeled objects or cluster-guiding constraints like “must-link” or “cannot-link”), the evaluation of semi-supervised clustering approaches has rarely been discussed. The application of cross-validation techniques, for example, is far from straightforward in the semi-supervised setting, but the problems associated with evaluation have not yet been addressed. Here we summarize these problems and provide a solution. Furthermore, in order to demonstrate the practical applicability of semi-supervised clustering methods, we provide a method for model selection in semi-supervised clustering based on this sound evaluation procedure. Our method allows the user to select, based on the available information (labels or constraints), the most appropriate clustering model (e.g., number of clusters, density parameters) for a given problem.

Citation

M. Pourrajabi, D. Moulavi, R. Campello, A. Zimek, J. Sander, R. Goebel. "Model Selection for Semi-Supervised Clustering". International Conference on Extending Database Technology (EDBT), Athens, Greece, pp. 331-342, March 2014.

Category: In Conference
Web Links: EDBT

BibTeX

@inproceedings{Pourrajabi+al:(EBDT)14,
  author = {Mojgan Pourrajabi and Davoud Moulavi and Ricardo J. G. B. Campello
    and Arthur Zimek and Jörg Sander and Randy Goebel},
  title = {Model Selection for Semi-Supervised Clustering},
  pages = {331--342},
  booktitle = {International Conference on Extending Database Technology
    (EDBT)},
  year = 2014,
}

Last Updated: June 19, 2020
Submitted by Sabina P
