Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking
- Andrew Foss, University of Alberta
- Osmar R. Zaiane, University of Alberta (Database)
- Sandra Zilles
This paper introduces a new extension of outlier detection approaches and a new concept, class separation through variance. We show that accumulating information about the outlierness of points in multiple subspaces leads to a ranking in which classes with differing variance naturally tend to separate. Exploiting this leads to a highly effective and efficient unsupervised class separation approach, especially useful in the difficult case of heavily overlapping distributions. Unlike typical outlier detection algorithms, this method can be applied successfully beyond the ‘rare classes’ case. Two novel algorithms that implement this approach are provided. Experiments show that, on high-dimensional data, the novel methods typically outperform other state-of-the-art outlier detection methods, such as Feature Bagging, SOE1, LOF, ORCA and Robust Mahalanobis Distance, and compete even with the leading supervised classification methods.
Citation
A. Foss, O. Zaiane, S. Zilles. "Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking". IEEE International Conference on Data Mining (ICDM), Miami, USA, pp. 139-148, December 2009.
Keywords: Outlier Detection, Classification, Subspaces
Category: In Conference
Web Links: IEEE
BibTeX
@incollection{Foss+al:ICDM09,
  author = {Andrew Foss and Osmar R. Zaiane and Sandra Zilles},
  title = {Unsupervised Class Separation of Multivariate Data through Cumulative Variance-based Ranking},
  pages = {139-148},
  booktitle = {IEEE International Conference on Data Mining (ICDM)},
  year = 2009,
}
Last Updated: January 15, 2020
Submitted by Sabina P