Using gene clustering to identify discriminatory genes with higher classification accuracy

A single DNA microarray measures thousands to tens of thousands of gene expression levels, but experimental datasets normally consist of far fewer such arrays, typically tens to hundreds, taken over a selection of tissue samples. The biological interpretation of these data relies on identifying subsets of induced or repressed genes that can be used to discriminate among categories of tissue, and so provide experimental evidence for connections between a subset of genes and the tissue pathology. A variety of methods can be used to identify discriminatory gene subsets, which can be ranked by classification accuracy. But the high dimensionality of the gene expression space, coupled with relatively few tissue samples, creates the dimensionality problem: the selected gene subsets are too large to provide convincing evidence for any plausible causal connection between that gene subset and the tissue pathology. We propose a new gene selection method, clustered gene selection (CGS), which, when coupled with existing methods, can identify gene subsets that overcome the dimensionality problem and improve classification accuracy. Experiments on eight real datasets showed that CGS can identify many more cancer-related genes and clearly improve classification accuracy, compared with three other non-CGS-based gene selection methods.
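The abstract does not spell out the CGS procedure, so the following is only an illustrative sketch of the general idea of coupling gene clustering with an existing ranking-based selector. It is not the authors' algorithm. Everything in it is an assumption made for illustration: the signal-to-noise style `score`, the greedy correlation-based grouping, the `threshold` parameter, the function names, and the toy expression values.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


def score(expr, labels):
    """Hypothetical ranking score: class-mean gap over summed class spread."""
    a = [v for v, l in zip(expr, labels) if l == 0]
    b = [v for v, l in zip(expr, labels) if l == 1]
    return abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-9)


def select_genes(data, labels, k, threshold=0.9):
    """Pick up to k genes, at most one per group of highly correlated genes.

    Genes are visited in descending score order. A gene starts a new
    group only if its absolute correlation with every gene already
    chosen is below `threshold`; otherwise it is treated as redundant.
    """
    ranked = sorted(data, key=lambda g: score(data[g], labels), reverse=True)
    chosen = []
    for g in ranked:
        if all(abs(pearson(data[g], data[h])) < threshold for h in chosen):
            chosen.append(g)
        if len(chosen) == k:
            break
    return chosen


# Toy data (made up): 4 genes measured on 6 samples, 3 per class.
labels = [0, 0, 0, 1, 1, 1]
data = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "g2": [2.0, 2.4, 1.8, 6.0, 6.2, 5.8],  # a scaled copy of g1
    "g3": [5.0, 4.8, 5.1, 2.0, 2.2, 1.9],  # strongly anti-correlated with g1
    "g4": [1.0, 3.0, 2.0, 3.0, 5.0, 4.0],  # weaker, less redundant signal
}

selected = select_genes(data, labels, k=2)
unclustered = select_genes(data, labels, k=2, threshold=1.1)
print(selected, unclustered)
```

With the threshold set above 1 the grouping step is effectively disabled and the selector falls back to plain top-k ranking, which on this toy data returns two near-duplicate genes. With grouping enabled, the second slot goes to a gene carrying less redundant information, which is the intuition behind keeping the selected subset small.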

Citation

Z. Cai, L. Xu, Y. Shi, M. Salavatipour, R. Goebel, G. Lin. "Using gene clustering to identify discriminatory genes with higher classification accuracy". IEEE Symposium on Bioinformatics and Bioengineering (BIBE), pp. 235-242, November 2006.

Keywords: machine learning
Category: In Conference

BibTeX

@inproceedings{Cai+al:BIBE06,
  author = {Z. Cai and Lizhe Xu and Yi Shi and M. Salavatipour and Randy Goebel
    and Guohui Lin},
  title = {Using gene clustering to identify discriminatory genes with higher
    classification accuracy},
  pages = {235--242},
  booktitle = {IEEE Symposium on Bioinformatics and Bioengineering (BIBE)},
  year = 2006,
}

Last Updated: July 17, 2007
Submitted by Staurt H. Johnson
