
Advantage of Integration in Big Data: Feature Generation in Multi-Relational Databases for Imbalanced Learning

Full Text: BigData16-2.pdf (PDF)

Most real-world applications rely on databases with multiple tables. The situation becomes more complicated in the realm of Big Data, where related information is spread over different data repositories. However, data mining techniques are usually applied to a single flat table. This work focuses on generating a mining table by aggregating information from multiple local tables and external data sources and automatically generating potentially discriminant features. It extends data aggregation techniques by navigating paths in which a single table is traversed multiple times. Existing techniques do not consider such paths, which results in the loss of several attributes. Our framework also prevents leakage of class information by avoiding features constructed from information available only after the class label is known. Experiments are performed on transactional data from a U.S. consumer electronics retailer to predict causes of product returns. In addition, we augmented the dataset with Suppliers information and Reviews to show the value of data integration. The results show that our technique improves classification accuracy and generates discriminant features that mitigate the impact of class imbalance.
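
As a rough illustration of the idea in the abstract, the sketch below flattens a toy purchases table into a mining table. It is not the authors' framework: the table layout, the column names, and the helper `build_mining_table` are all hypothetical. Each feature follows the path Purchase -> Customer -> Purchase, so the same table is traversed twice, and only strictly earlier purchases are aggregated as a simple guard against label leakage. It also assumes, as a simplification, that the return outcome of an earlier purchase is already known when the later purchase is made.

```python
from datetime import date

# Hypothetical toy data: one row per purchase; `returned` is the class label.
purchases = [
    {"id": 1, "customer": "a", "day": date(2016, 1, 5), "price": 100.0, "returned": True},
    {"id": 2, "customer": "a", "day": date(2016, 2, 1), "price": 40.0, "returned": False},
    {"id": 3, "customer": "a", "day": date(2016, 3, 9), "price": 60.0, "returned": True},
    {"id": 4, "customer": "b", "day": date(2016, 1, 20), "price": 25.0, "returned": False},
]


def build_mining_table(rows):
    """Return one flat feature row per purchase (illustrative helper).

    Features follow the path Purchase -> Customer -> Purchase, so the
    purchases table is visited twice. Only purchases dated strictly before
    the target purchase are aggregated, so the target's own label never
    contributes to its features.
    """
    table = []
    for target in rows:
        # Second traversal of the same table: earlier purchases by the same customer.
        earlier = [
            r for r in rows
            if r["customer"] == target["customer"] and r["day"] < target["day"]
        ]
        n = len(earlier)
        table.append({
            "id": target["id"],
            "prior_count": n,
            "prior_avg_price": sum(r["price"] for r in earlier) / n if n else 0.0,
            # Assumes earlier return outcomes are known by the target date.
            "prior_return_rate": sum(r["returned"] for r in earlier) / n if n else 0.0,
            "label": target["returned"],
        })
    return table


mining_table = build_mining_table(purchases)
for row in mining_table:
    print(row)
```

The resulting rows can be passed to any single-table classifier. A per-customer return rate of this kind is one example of a feature that a single pass over the purchases table would not produce.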

Citation

F. Ahmed, M. Samorani, C. Bellinger, O. Zaiane. "Advantage of Integration in Big Data: Feature Generation in Multi-Relational Databases for Imbalanced Learning". IEEE International Conference on Big Data, Washington, USA, December 2016.

Keywords: Data integration, Feature construction, Classification, Class imbalance
Category: In Conference
Web Links: Webdocs

BibTeX

@inproceedings{Ahmed+al:16,
  author = {Farrukh Ahmed and Michele Samorani and Colin Bellinger and Osmar R.
    Zaiane},
  title = {Advantage of Integration in Big Data: Feature Generation in
    Multi-Relational Databases for Imbalanced Learning},
  booktitle = {IEEE International Conference on Big Data},
  year = 2016,
}

Last Updated: November 05, 2019
Submitted by Sabina P
