
A Measure optimized cost-sensitive learning framework for imbalanced data classification

Full Text: BiologicallyInspired13.pdf

Class imbalance is one of the challenging problems for machine learning in many real-world applications. Many methods have been proposed to address it, including sampling and cost-sensitive learning. The latter has attracted significant attention in recent years, but it is difficult to determine precise misclassification costs in practice. Other factors also influence classification performance, including the input feature subset and the intrinsic parameters of the classifier. This paper presents an effective wrapper framework that incorporates the evaluation measure (AUC and G-mean) directly into the objective function of cost-sensitive learning to improve classification performance, by simultaneously optimizing the feature subset, the intrinsic parameters, and the misclassification cost parameter. The optimization is based on Particle Swarm Optimization (PSO). We use two common methods, support vector machines and feed-forward neural networks, to evaluate the proposed framework. Experimental results on various standard benchmark datasets with different imbalance ratios and on a real-world problem show that the proposed method is effective in comparison with commonly used sampling techniques.
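As a rough illustration of putting the evaluation measure into the objective, the sketch below tunes a single misclassification cost ratio so that G-mean is maximized. It is not the chapter's method: a plain search over a few candidate values stands in for PSO, only the cost parameter is tuned (no feature subset or classifier parameters), and the function names, the toy labels, and the scores are all made up for this example.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def predict_with_cost(scores, cost_ratio):
    """Predict positive when the cost-weighted expected loss favors it.

    Assumes each score estimates P(y=1) and that cost_ratio is the cost of
    a missed positive divided by the cost of a false alarm. Predicting
    positive is cheaper when cost_ratio * p >= 1 - p, i.e. when
    p >= 1 / (1 + cost_ratio).
    """
    threshold = 1.0 / (1.0 + cost_ratio)
    return [1 if s >= threshold else 0 for s in scores]


def best_cost_ratio(y_true, scores, candidates):
    """Return the candidate cost ratio with the highest G-mean.

    A simple search over candidates; the chapter uses PSO instead.
    """
    return max(
        candidates,
        key=lambda c: g_mean(y_true, predict_with_cost(scores, c)),
    )


# Toy data (hypothetical): 8 majority (0) and 2 minority (1) examples.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.35, 0.45]

equal_cost_pred = predict_with_cost(scores, 1.0)  # threshold 0.5
tuned_ratio = best_cost_ratio(y_true, scores, [1.0, 1.5, 2.0, 3.0, 5.0])
tuned_pred = predict_with_cost(scores, tuned_ratio)

print("G-mean with equal costs:", g_mean(y_true, equal_cost_pred))
print("Selected cost ratio:", tuned_ratio)
print("G-mean with tuned cost:", g_mean(y_true, tuned_pred))
```

With equal costs, every toy example is predicted as the majority class, so accuracy is 80% while G-mean is zero. This is why a measure such as G-mean is a more useful objective than accuracy on imbalanced data.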

Citation

P. Cao, O. Zaiane, D. Zhao. "A Measure optimized cost-sensitive learning framework for imbalanced data classification". Biologically-Inspired Techniques for Knowledge Discovery and Data Mining, Advances in Data Mining and Database Management Book Series, IGI Global, (ed: Shafiq Alam, Yun Sing Koh, and Gillian Dobbie), pp 1-24, October 2014.

Category: In Book
Web Links: Webdocs

BibTeX

@inbook{Cao+al:14,
  author = {Peng Cao and Osmar R. Zaiane and Dazhe Zhao},
  title = {A Measure optimized cost-sensitive learning framework for
    imbalanced data classification},
  Booktitle = {Biologically-Inspired Techniques for Knowledge Discovery and
    Data Mining, Advances in Data Mining and Database Management Book Series},
  Publisher = {IGI Global},
  Editor = {Shafiq Alam and Yun Sing Koh and Gillian Dobbie},
  Pages = {1-24},
  year = 2014,
}

Last Updated: October 31, 2019
Submitted by Sabina P
