Hierarchical Prediction of Protein Function in The Gene Ontology Using Graphical Models

Full Text: 2008-Theses-PengWang.pdf (PDF)

High-throughput functional annotation of proteins is a fundamental task in functional proteomics. Protein functions are typically organized in a general-to-specific hierarchy, such as the Gene Ontology (GO), which describes when one functional class is a specialization of its parent class. The hierarchical structure implies that if a protein belongs to one class, then it also belongs to all ancestor classes up to the root. Most previous work on protein function prediction has constructed an independent classifier for each function, an approach that ignores the hierarchical information available in the GO. We develop a framework for combining local, independent SVM predictions with graphical models, both Bayesian networks (BNs) and Conditional Random Fields (CRFs), which are built upon the hierarchical structure of the GO. Our goal is to increase overall predictive accuracy by exploiting this hierarchical information. Compared to the baseline technique (i.e., independent SVM classifiers), our techniques using BNs and CRFs yield significant improvements on two large data sets constructed from the UniProt database.
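The hierarchy constraint described in the abstract (membership in a class implies membership in every ancestor class) can be illustrated with a small sketch. This is not the BN or CRF method of the thesis; it only shows why independent per-class predictions can be inconsistent with the hierarchy and what a consistent label set looks like. The toy hierarchy, the term names, and the helper functions `ancestors`, `close_upward`, and `is_consistent` are all hypothetical and chosen for illustration.

```python
# Toy hierarchy: term -> list of parent terms.
# GO is a directed acyclic graph, so a term may have more than one parent.
# The term names are hypothetical placeholders, not real GO identifiers.
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A1a": ["A1"],
    "AB": ["A", "B"],
}


def ancestors(term, parents=PARENTS):
    """Return every ancestor of `term` up to the root (excluding `term` itself)."""
    seen = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen


def close_upward(labels, parents=PARENTS):
    """Add all ancestors of each predicted term so the set respects the hierarchy."""
    closed = set(labels)
    for term in labels:
        closed |= ancestors(term, parents)
    return closed


def is_consistent(labels, parents=PARENTS):
    """True if every predicted term has all of its ancestors predicted as well."""
    return close_upward(labels, parents) == set(labels)


# Suppose independent classifiers predicted only these two terms (assumed output).
independent_prediction = {"A1a", "B"}
print(is_consistent(independent_prediction))
print(sorted(close_upward(independent_prediction)))
```

Here the independent prediction is inconsistent because "A1a" is predicted without its ancestors "A1", "A", and "root". Closing the set upward is the simplest possible repair; the graphical models in the thesis instead combine the classifier outputs using the hierarchy structure.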

Citation

P. Wang. "Hierarchical Prediction of Protein Function in The Gene Ontology Using Graphical Models". MSc Thesis, University of Alberta, April 2008.

Keywords: Proteome Analyst, machine learning, hierarchy, GO hierarchy
Category: MSc Thesis

BibTeX

@mastersthesis{Wang:08,
  author = {Peng Wang},
  title = {Hierarchical Prediction of Protein Function in The Gene Ontology
    Using Graphical Models},
  school = {University of Alberta},
  month = apr,
  year = 2008,
}

Last Updated: April 27, 2012
Submitted by Russ Greiner
