
Parallel Association Rule Mining with Minimum Inter-Processor Communication

Full Text: padd03.pdf (PDF)

Existing parallel association rule mining algorithms suffer from many problems when mining massive transactional datasets. One major problem is that most parallel algorithms for a shared-nothing environment are Apriori-based. Apriori-based algorithms have been shown not to scale, mainly because of (1) repeated I/O disk scans and (2) the heavy computation and communication involved in candidate generation. This paper proposes a new disk-based parallel association rule mining algorithm called Inverted Matrix, which achieves its efficiency by applying three new ideas. First, transactional data is converted into a new database layout, called Inverted Matrix, that avoids multiple scans of the database during the mining phase: globally frequent patterns can be found in less than one full scan using random access. This data structure is replicated among the parallel nodes. Second, for each frequent item assigned to a parallel node, a relatively small independent tree is built summarizing co-occurrences. Finally, a simple, non-recursive mining process reduces memory requirements, since only minimal candidate generation and counting is needed, and no communication between nodes is required to generate all globally frequent patterns.
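
The abstract gives no implementation details, so the following Python sketch only illustrates the general idea of mining from a replicated inverted layout with frequent items partitioned among nodes. It is not the paper's Inverted Matrix or its co-occurrence trees: it substitutes a plain item-to-transaction-id index and set intersection, and every name in it (build_inverted_index, mine_for_item, mine_all, n_workers) is a hypothetical chosen for illustration. Each itemset is owned by its least frequent item, so a worker can enumerate its share from the replicated index alone, using an explicit stack instead of recursion.

```python
from collections import defaultdict


def build_inverted_index(transactions):
    """Map each item to the set of transaction ids containing it (one scan)."""
    index = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            index[item].add(tid)
    return dict(index)


def mine_for_item(index, order, owner, min_support):
    """Return frequent itemsets whose lowest-ranked item is `owner`.

    Only the replicated index is read, so a worker handling `owner`
    needs nothing from other workers. (Illustrative assumption, not the
    paper's tree-based procedure.)
    """
    rank = {item: i for i, item in enumerate(order)}
    later = [item for item in order if rank[item] > rank[owner]]
    results = {}
    # Explicit stack: (itemset, supporting tids, next extension position).
    stack = [((owner,), index[owner], 0)]
    while stack:
        itemset, tids, start = stack.pop()
        results[frozenset(itemset)] = len(tids)
        for pos in range(start, len(later)):
            extended = tids & index[later[pos]]
            if len(extended) >= min_support:
                stack.append((itemset + (later[pos],), extended, pos + 1))
    return results


def mine_all(transactions, min_support, n_workers=2):
    """Simulate item-partitioned mining; each loop pass stands in for a node."""
    index = build_inverted_index(transactions)
    # Rank frequent items by ascending support (ties broken by name).
    frequent = sorted(
        (item for item in index if len(index[item]) >= min_support),
        key=lambda item: (len(index[item]), item),
    )
    # Round-robin assignment of frequent items to workers.
    assignments = [frequent[w::n_workers] for w in range(n_workers)]
    merged = {}
    for items in assignments:
        for owner in items:
            merged.update(mine_for_item(index, frequent, owner, min_support))
    return merged


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in mine_all(data, min_support=3).items():
        print(sorted(itemset), support)
```

Because every frequent itemset has exactly one owner item, the union of the per-worker results is the same for any value of n_workers, which mirrors the abstract's claim that no inter-node communication is needed during mining.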

Citation

M. El-Hajj, O. Zaiane. "Parallel Association Rule Mining with Minimum Inter-Processor Communication". International Workshop on Parallel and Distributed Databases (PaDD), pp. 519-523, September 2003.

Category: In Workshop

BibTeX

@inproceedings{El-Hajj+Zaiane:PaDD03,
  author = {Mohammad El-Hajj and Osmar R. Zaiane},
  title = {Parallel Association Rule Mining with Minimum Inter-Processor
    Communication},
  booktitle = {International Workshop on Parallel and Distributed Databases
    (PaDD)},
  pages = {519--523},
  month = sep,
  year = 2003,
}

Last Updated: February 04, 2020
Submitted by Sabina P
