Contrasting Sequence Groups by Emerging Sequences

Full Text: DS09.pdf

Group comparison is a fundamental task in many scientific endeavours and is also the basis of any classifier. Contrast sets and emerging patterns capture contrasts between groups of categorical data, and comparing groups of sequence data is relevant in many applications. We define Emerging Sequences (ESs) as subsequences that are frequent in the sequences of one group and less frequent in the sequences of another, and that therefore distinguish, or contrast, sequences of different classes. Distinguishing sequence classes poses two challenges: extracting ESs efficiently is not trivial, and only exact matches of sequences are considered. We address these problems with a suffix tree-based framework and a sliding window matching mechanism for the distance metric between sequences. We propose a classifier for sequence data based on Emerging Sequences. Evaluated against two learning algorithms based on frequent subsequences and exact-matching subsequences, experiments on two datasets show that our classification model based on similar ESs outperforms the baseline approaches by up to 20% in prediction accuracy.
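To make the definition of an Emerging Sequence concrete, the following is a minimal sketch, not the authors' implementation. It assumes that subsequences are contiguous substrings, enumerates them by brute force rather than with a suffix tree, and uses only exact matching. The function names and the thresholds `min_support`, `max_other_support`, and `max_len` are hypothetical choices made for illustration.

```python
from collections import Counter


def substrings(seq, max_len):
    """Return the set of contiguous substrings of seq with length 1..max_len."""
    return {seq[i:i + n]
            for n in range(1, max_len + 1)
            for i in range(len(seq) - n + 1)}


def support(group, max_len):
    """Map each substring to the fraction of sequences in group containing it."""
    counts = Counter()
    for seq in group:
        # A set is used so each sequence contributes at most once per substring.
        counts.update(substrings(seq, max_len))
    return {s: c / len(group) for s, c in counts.items()}


def emerging_sequences(target, other, min_support=0.5,
                       max_other_support=0.25, max_len=3):
    """Substrings frequent in target and infrequent in other (assumed thresholds)."""
    sup_target = support(target, max_len)
    sup_other = support(other, max_len)
    return sorted(s for s, v in sup_target.items()
                  if v >= min_support
                  and sup_other.get(s, 0.0) <= max_other_support)


# Toy data: "abc" occurs in every sequence of group_a and in none of group_b.
group_a = ["abcd", "xabc", "abcz"]
group_b = ["acbd", "bdca", "dcba", "cabd"]
print(emerging_sequences(group_a, group_b))
```

In this toy example the single symbols `a`, `b`, and `c` are frequent in both groups and are therefore not reported, while `ab`, `abc`, and `bc` are reported as contrasting the first group against the second.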

Citation

K. Deng, O. Zaiane. "Contrasting Sequence Groups by Emerging Sequences". Discovery Science, Porto, Portugal, pp. 377-384, October 2009.

Keywords: Emerging Sequences, Classification, Sequence Similarity
Category: In Conference

BibTeX

@incollection{Deng+Zaiane:09,
  author = {Kang Deng and Osmar R. Zaiane},
  title = {Contrasting Sequence Groups by Emerging Sequences},
  pages = {377--384},
  booktitle = {Discovery Science},
  year = 2009,
}

Last Updated: January 15, 2020
Submitted by Sabina P
