Automatic Generation of Relational Attributes: An Application to Product Returns

Full Text: BigData16-1.pdf (PDF)

Although statistical and machine learning methods require the input data to be in a tabular format, in real-world applications data are often stored across several tables in a relational database. How to build a single mining table from a relational database is a critical pre-processing step of any classification method, because including the right attributes may dramatically boost the accuracy of the classifier. We propose a methodology and implement a software program, Dataconda, to automatically mine a relational database. The user selects a class attribute contained in a table of the database and the procedure builds and selects predictors by exploring the whole database and aggregating information, without any user intervention. For example, our procedure may find that the best predictor for “product return” is the proportion of products returned by the same customer in the past, even if the user has not built any such attribute. Our procedure produces more expressive attributes than existing methods. Our experiments on the ISMS Durable Goods Datasets, a publicly available data set of product returns in retailing, suggest that our method allows new knowledge to emerge.
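The kind of relational attribute mentioned in the abstract, the proportion of products previously returned by the same customer, can be illustrated with a small sketch. This is not Dataconda's implementation: the `purchases` table, its columns, and the `past_return_rate` helper are hypothetical and serve only to show how aggregating over related rows produces a new predictor column.

```python
from collections import defaultdict
from datetime import date

# Hypothetical purchases table (assumed schema, not from the paper):
# (purchase_id, customer_id, purchase_date, returned)
purchases = [
    (1, "c1", date(2016, 1, 5), True),
    (2, "c1", date(2016, 2, 1), False),
    (3, "c2", date(2016, 2, 3), False),
    (4, "c1", date(2016, 3, 9), True),
    (5, "c2", date(2016, 4, 2), True),
]


def past_return_rate(rows):
    """Map purchase_id -> share of the same customer's earlier purchases
    that were returned, or None if the customer has no earlier purchase."""
    # customer_id -> [number of past purchases, number of past returns]
    counts = defaultdict(lambda: [0, 0])
    feature = {}
    # Process purchases chronologically so each row only sees the past.
    for pid, cust, _when, returned in sorted(rows, key=lambda r: r[2]):
        n, k = counts[cust]
        feature[pid] = k / n if n else None
        counts[cust][0] += 1
        counts[cust][1] += int(returned)
    return feature


rates = past_return_rate(purchases)
for pid in sorted(rates):
    print(pid, rates[pid])
```

Each purchase receives a value computed only from that customer's earlier rows, so the attribute does not leak the outcome it is meant to predict. The paper's procedure searches for such aggregations automatically, whereas here the aggregation is written by hand.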

Citation

M. Samorani, F. Ahmed, O. Zaiane. "Automatic Generation of Relational Attributes: An Application to Product Returns". IEEE International Conference on Big Data, Washington, USA, December 2016.

Keywords: Feature construction, Knowledge discovery, Software tools
Category: In Conference
Web Links: Webdocs

BibTeX

@inproceedings{Samorani+al:16,
  author = {Michele Samorani and Farrukh Ahmed and Osmar R. Zaiane},
  title = {Automatic Generation of Relational Attributes: An Application to
    Product Returns},
  booktitle = {IEEE International Conference on Big Data},
  year = 2016,
}

Last Updated: November 05, 2019
Submitted by Sabina P