A Disease Classifier for Metabolic Profiles Based on Metabolic Pathway Knowledge

Full Text: Eastman_Thomas_Spring+2010.pdf

This thesis presents Pathway Informed Analysis (PIA), a classification method that predicts disease states (diagnosis) from metabolic profile measurements and incorporates biological knowledge in the form of metabolic pathways. A metabolic pathway describes a set of chemical reactions that perform a specific biological function. A significant amount of the biological knowledge produced by efforts to identify and understand these pathways is formalized in readily accessible databases such as the Kyoto Encyclopedia of Genes and Genomes. PIA uses metabolic pathways to identify relationships among the metabolite concentrations that are measured by a metabolic profile. Specifically, PIA assumes that the class-conditional metabolite concentrations (one distribution for diseased patients, one for healthy patients) follow multivariate normal distributions. It further assumes that conditional independence statements about these distributions, derived from the pathways, relate the concentrations of the metabolites to each other. Together, the two assumptions allow for a natural representation of the class-conditional distributions using a type of probabilistic graphical model called a Gaussian Markov Random Field. PIA efficiently estimates the parameters defining these distributions from example patients to produce a classifier. It classifies an undiagnosed patient by evaluating both models to determine the most probable class given the patient's metabolic profile. We apply PIA to a data set of cancer patients to diagnose those with a muscle wasting disease called cachexia. We compare PIA against standard machine learning algorithms: Naive Bayes, Tree-augmented Naive Bayes, Support Vector Machines and C4.5. The overall classification accuracy of PIA on this data set is higher than that of these algorithms, but the difference is not statistically significant. We also apply PIA to several other classification tasks. Some involve predicting various manipulations of the metabolic processes performed in experiments with worms. Others classify pigs according to properties of their dietary intake. The accuracy of PIA at these tasks is not significantly better than that of the standard algorithms.
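The classification step described in the abstract (fit one multivariate normal per class, then pick the most probable class for a new profile) can be illustrated with a minimal sketch. This is not the thesis's implementation: it fits an unconstrained Gaussian per class and does not impose the pathway-derived conditional independence structure, which in a Gaussian Markov Random Field would correspond to zeros in the precision matrix. The function names, the ridge term, the equal priors and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_gaussian(X, ridge=1e-3):
    """Estimate the mean and precision matrix of one class (rows = patients).

    The ridge term is an assumption added to keep the covariance invertible;
    no pathway-based sparsity is imposed on the precision matrix here.
    """
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + ridge * np.eye(X.shape[1])
    return mu, np.linalg.inv(cov)

def log_density(x, mu, precision):
    """Log multivariate normal density, parameterised by the precision matrix."""
    d = x - mu
    _, logdet = np.linalg.slogdet(precision)
    return 0.5 * (logdet - d @ precision @ d - len(x) * np.log(2 * np.pi))

def classify(x, models, priors):
    """Return the class with the highest log posterior (up to a constant)."""
    scores = {c: log_density(x, *models[c]) + np.log(priors[c]) for c in models}
    return max(scores, key=scores.get)

# Synthetic stand-in for metabolite concentrations (hypothetical data).
rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, size=(50, 3))
diseased = rng.normal(2.0, 1.0, size=(50, 3))

models = {"healthy": fit_gaussian(healthy), "diseased": fit_gaussian(diseased)}
priors = {"healthy": 0.5, "diseased": 0.5}

label = classify(np.array([2.0, 2.0, 2.0]), models, priors)
print(label)
```

Working with the precision matrix rather than the covariance keeps the sketch close to the graphical-model view, since that is where a pathway graph would constrain the parameters.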

Citation

T. Eastman. "A Disease Classifier for Metabolic Profiles Based on Metabolic Pathway Knowledge". MSc Thesis, University of Alberta, February 2010.

Keywords: machine learning, metabolic profile, metabolic pathway, graphical model, cachexia, bioinformatics
Category: MSc Thesis

BibTeX

@mastersthesis{Eastman:10,
  author = {Thomas Eastman},
  title = {A Disease Classifier for Metabolic Profiles Based on Metabolic
    Pathway Knowledge},
  school = {University of Alberta},
  year = 2010,
}

Last Updated: March 11, 2010
Submitted by Russ Greiner
