Hybrid probabilistic sampling with random subspace for imbalanced data learning

Full Text: IDA2013.pdf (PDF)

Class imbalance is one of the challenging problems for machine learning in many real-world applications. Other issues, such as within-class imbalance and high dimensionality, can exacerbate the problem. We propose a method, HPSDRS, that combines two ideas to address these issues: a Hybrid Probabilistic Sampling (HPS) technique and a Diverse Random Subspace (DRS) ensemble. HPS improves the performance of traditional re-sampling algorithms with the aid of a probability function, since simply manipulating class sizes is not sufficient for imbalanced data with a complex distribution. Moreover, the DRS ensemble employs a minimum overlapping mechanism to provide diversity, together with weighted voting, so as to improve generalization performance. The experimental results demonstrate that our method is efficient for learning from imbalanced data and can achieve better results than state-of-the-art methods for imbalanced data.
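
The abstract names three ingredients: re-sampling, a random subspace ensemble whose members overlap as little as possible, and weighted voting. The sketch below illustrates that general pattern only; it is not the authors' HPSDRS. As loudly labeled assumptions, plain random oversampling stands in for HPS (no probability function is modeled), fully disjoint feature subsets stand in for the minimum overlapping mechanism, the base learner is a nearest-centroid classifier, and each member's vote is weighted by its balanced accuracy on the training data. All function names are hypothetical.

```python
import random

def oversample(X, y, rng):
    """Assumption: plain random oversampling, a stand-in for HPS."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    Xb, yb = [], []
    for label, rows in by_class.items():
        # Duplicate randomly chosen rows until this class reaches the target size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            Xb.append(row)
            yb.append(label)
    return Xb, yb

def disjoint_subspaces(n_features, n_members, rng):
    """Assumption: zero-overlap feature subsets (needs n_members <= n_features)."""
    idx = list(range(n_features))
    rng.shuffle(idx)
    # Deal the shuffled indices round-robin so no feature is shared.
    return [sorted(idx[i::n_members]) for i in range(n_members)]

def fit_centroids(X, y, feats):
    """Nearest-centroid base learner restricted to the features in `feats`."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for j, f in enumerate(feats):
            acc[j] += row[f]
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict_one(centroids, feats, row):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(feats, centroids[c]))
    return min(centroids, key=dist)

def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so the minority class counts as much as the majority."""
    recalls = []
    for c in set(y_true):
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / sum(1 for t in y_true if t == c))
    return sum(recalls) / len(recalls)

def fit_ensemble(X, y, n_members=3, seed=0):
    """Train one base learner per feature subset on the rebalanced data."""
    rng = random.Random(seed)
    Xb, yb = oversample(X, y, rng)
    members = []
    for feats in disjoint_subspaces(len(X[0]), n_members, rng):
        cents = fit_centroids(Xb, yb, feats)
        preds = [predict_one(cents, feats, r) for r in X]
        # Assumption: vote weight = balanced accuracy on the original training set.
        members.append((feats, cents, balanced_accuracy(y, preds)))
    return members

def predict(members, row):
    """Weighted vote over all ensemble members."""
    votes = {}
    for feats, cents, weight in members:
        label = predict_one(cents, feats, row)
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)
```

Calling `fit_ensemble(X, y)` on a list of feature rows and labels returns the weighted members, and `predict(members, row)` classifies a new row. Weighting by training-set balanced accuracy is the simplest choice for a sketch; a held-out split would give a less optimistic weight.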

Citation

P. Cao, D. Zhao, O. Zaiane. "Hybrid probabilistic sampling with random subspace for imbalanced data learning". Intelligent Data Analysis: An International Journal, 18(6), pp. 1089-1108, November 2014.

Keywords: Classification, class imbalance, sampling method, ensemble learning, random subspace method
Category: In Journal
Web Links: Webdocs

BibTeX

@article{Cao+al:14,
  author = {Peng Cao and Dazhe Zhao and Osmar R. Zaiane},
  title = {Hybrid probabilistic sampling with random subspace for imbalanced
    data learning},
  journal = {Intelligent Data Analysis: An International Journal},
  volume = {18},
  number = {6},
  pages = {1089--1108},
  year = {2014},
}

Last Updated: October 29, 2019
Submitted by Sabina P
