
COFI Approach for Mining Frequent Itemsets Revisited

Full Text: dmkd04-2.pdf

The COFI approach for mining frequent itemsets, introduced recently, is an efficient algorithm that was demonstrated to outperform state-of-the-art algorithms on synthetic data. For instance, COFI is not only one order of magnitude faster than the popular FP-Growth while requiring significantly less memory, it is also very effective with extremely large datasets, better than any reported algorithm. However, COFI has a significant drawback when mining dense transactional databases, which is the case with some real datasets. The algorithm performs poorly in these cases because it ends up generating too many local candidates that are doomed to be infrequent. In this paper, we present a new algorithm, COFI*, for mining frequent itemsets. This novel algorithm uses the same data structure, the COFI-tree, as its predecessor, but partitions the patterns in such a way as to avoid the drawbacks of COFI. Moreover, its approach uses a pseudo-Oracle to pinpoint the maximal itemsets, from which all frequent itemsets are derived and counted, avoiding the generation of candidates fated to be infrequent. Our implementation, tested on real and synthetic data, shows that the COFI* algorithm outperforms state-of-the-art algorithms, among them COFI itself.
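
The idea of deriving and counting all frequent itemsets from the maximal ones can be illustrated with a minimal Python sketch. This is not the COFI* algorithm: it uses no COFI-tree and no pseudo-Oracle, and simply assumes the maximal frequent itemsets are already known. The function names (`subsets_of_maximal`, `count_supports`) and the toy transactions are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def subsets_of_maximal(maximal_itemsets):
    """Enumerate every non-empty subset of the given maximal itemsets.

    Every subset of a frequent itemset is itself frequent, so no
    infrequent candidate is ever generated here.
    """
    derived = set()
    for itemset in maximal_itemsets:
        items = sorted(itemset)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                derived.add(frozenset(combo))
    return derived


def count_supports(transactions, itemsets):
    """Count, by a plain scan, how many transactions contain each itemset."""
    counts = {itemset: 0 for itemset in itemsets}
    for transaction in transactions:
        present = set(transaction)
        for itemset in itemsets:
            if itemset <= present:
                counts[itemset] += 1
    return counts


# Toy data (assumption): with a minimum support of 2, the maximal
# frequent itemsets are {a, b, c} and {b, d}.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"b", "d"},
    {"b", "d"},
]
maximal = [{"a", "b", "c"}, {"b", "d"}]

frequent = subsets_of_maximal(maximal)
supports = count_supports(transactions, frequent)

for itemset in sorted(supports, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), supports[itemset])
```

In this sketch the supports come from a naive scan of the transactions; the paper instead counts them using the COFI-tree structure, which this example does not attempt to reproduce.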

Citation

M. El-Hajj, O. Zaiane. "COFI Approach for Mining Frequent Itemsets Revisited". Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD), ACM, pp 70-75, June 2004.

Category: In Workshop
Web Links: ACM Digital Library

BibTeX

@inproceedings{El-Hajj+Zaiane:DMKD04,
  author = {Mohammad El-Hajj and Osmar R. Zaiane},
  title = {COFI Approach for Mining Frequent Itemsets Revisited},
  booktitle = {Workshop on Research Issues in Data Mining and Knowledge
    Discovery (DMKD)},
  publisher = {ACM},
  pages = {70--75},
  month = jun,
  year = 2004,
}

Last Updated: February 04, 2020
Submitted by Sabina P
