Model Selection for Semi-Supervised Clustering
Full Text: PourrajabiMCZSG14.pdf
Although there is a large and growing literature that tackles the semi-supervised clustering problem (i.e., using some labeled objects or cluster-guiding constraints like “must-link” or “cannot-link”), the evaluation of semi-supervised clustering approaches has rarely been discussed. The application of cross-validation techniques, for example, is far from straightforward in the semi-supervised setting, yet the problems associated with evaluation have yet to be addressed. Here we summarize these problems and provide a solution. Furthermore, in order to demonstrate practical applicability of semi-supervised clustering methods, we provide a method for model selection in semi-supervised clustering based on this sound evaluation procedure. Our method allows the user to select, based on the available information (labels or constraints), the most appropriate clustering model (e.g., number of clusters, density-parameters) for a given problem.
Citation
M. Pourrajabi, D. Moulavi, R. Campello, A. Zimek, J. Sander, R. Goebel. "Model Selection for Semi-Supervised Clustering". International Conference on Extending Database Technology (EDBT), Athens, Greece, pp 331-342, March 2014.
Keywords: | |
Category: | In Conference |
Web Links: | EDBT |
BibTeX
@incollection{Pourrajabi+al:(EBDT)14,
  author = {Mojgan Pourrajabi and Davoud Moulavi and Ricardo J. G. B. Campello and Arthur Zimek and Jörg Sander and Randy Goebel},
  title = {Model Selection for Semi-Supervised Clustering},
  Pages = {331-342},
  booktitle = {International Conference on Extending Database Technology (EDBT)},
  year = 2014,
}
Last Updated: June 19, 2020
Submitted by Sabina P