Parallel Bifold: Large-Scale Parallel Pattern Mining with Constraints

Full Text: dpd06.pdf

When computationally feasible, mining huge databases produces tremendously large numbers of frequent patterns. In many cases, it is impractical to mine those datasets due to their sheer size: not only the number of existing patterns, but mainly the magnitude of the search space. Many approaches have suggested either applying constraints to the patterns or searching for frequent patterns in parallel. So far, those approaches are still not genuinely effective for mining extremely large datasets. We propose a method that combines both strategies efficiently, i.e., mining in parallel for the set of patterns while pushing constraints. Using this approach, we could mine significantly large datasets, with sizes never before reported in the literature. We are able to effectively discover frequent patterns in a database of a billion transactions using a 32-processor cluster in less than 2 hours.

Citation

M. El-Hajj, O. Zaiane. "Parallel Bifold: Large-Scale Parallel Pattern Mining with Constraints". Distributed and Parallel Databases, An International Journal, 20(3), pp. 225-243, October 2006.

Category: In Journal
Web Links: Webdocs

BibTeX

@article{El-Hajj+Zaiane:06,
  author = {Mohammad El-Hajj and Osmar R. Zaiane},
  title = {Parallel Bifold: Large-Scale Parallel Pattern Mining with
    Constraints},
  volume = {20},
  number = {3},
  pages = {225--243},
  journal = {Distributed and Parallel Databases, An International Journal},
  year = 2006,
}

Last Updated: October 31, 2019
Submitted by Sabina P
