Proteome Analyst: Custom Predictions with Explanations in a Web-based Tool for High-Throughput Proteome Annotations
- Duane Szafron, UofA CS
- Paul Lu, Department of Computing Science
- Russ Greiner, Dept of Computing Science; PI of AICML
- David S. Wishart, Departments of Computing Science and Biology, University of Alberta
- Brett Poulin, Computing Science
- Roman Eisner
- Zhiyong Lu
- John Anvik
- David Meeuwis
- Alona Fyshe
- Cam Macdonell, Computing Science
Proteome Analyst (PA) (http://www.cs.ualberta.ca/~bioinfo/PA/) is a publicly available, high-throughput, Web-based system for predicting various properties of each protein in an entire proteome. Using machine-learned classifiers, PA can predict, for example, the GeneQuiz general function and Gene Ontology (GO) molecular function of a protein. As well, PA is currently the most accurate and most comprehensive system for predicting subcellular localization, the location within a cell where a protein performs its main function. Two other capabilities of PA are notable. First, PA can create a custom classifier to predict a new property, without requiring any programming, based on labeled training data (i.e., a set of examples, each with the correct classification label) provided by a user. PA has been used to create custom classifiers for K-ion proteins and other general-function ontologies. Second, PA provides a sophisticated explanation feature that shows why one prediction is chosen over another. The PA system produces a Naive Bayes classifier, which is amenable to a graphical and interactive approach to explanations for its predictions; transparent predictions increase the user's confidence in, and understanding of, PA. The explanations use hyperlinks to clearly display the evidence for each prediction.
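The abstract's point that a Naive Bayes classifier lends itself to explanation can be illustrated with a minimal sketch. This is not PA's implementation: the token features, class labels, training examples, and the function names `train`, `predict`, and `explain` are all hypothetical, chosen only to show how a classifier can be built from labeled examples and how each feature's contribution to one prediction over another can be read off as a log-odds term.

```python
import math
from collections import Counter, defaultdict


def train(examples, alpha=1.0):
    """Fit a multinomial Naive Bayes model from (tokens, label) pairs.

    alpha is the Laplace smoothing constant (an illustrative default).
    """
    label_counts = Counter(label for _, label in examples)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        token_counts[label].update(tokens)
        vocab.update(tokens)
    total = sum(label_counts.values())
    log_prior = {c: math.log(n / total) for c, n in label_counts.items()}
    log_like = {}
    for c in label_counts:
        denom = sum(token_counts[c].values()) + alpha * len(vocab)
        log_like[c] = {
            t: math.log((token_counts[c][t] + alpha) / denom) for t in vocab
        }
    return log_prior, log_like


def predict(model, tokens):
    """Return the best label and the log-score of every label."""
    log_prior, log_like = model
    scores = {
        c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
        for c in log_prior
    }
    return max(scores, key=scores.get), scores


def explain(model, tokens, label_a, label_b):
    """Per-token log-odds evidence favouring label_a over label_b.

    Positive values support label_a, negative values support label_b.
    Tokens unseen in training are ignored, as in predict().
    """
    _, log_like = model
    contrib = {
        t: n * (log_like[label_a][t] - log_like[label_b][t])
        for t, n in Counter(tokens).items()
        if t in log_like[label_a]
    }
    return sorted(contrib.items(), key=lambda kv: -abs(kv[1]))


# Hypothetical labeled training data: (feature tokens, class label).
examples = [
    (["membrane", "transport"], "membrane"),
    (["membrane", "channel"], "membrane"),
    (["nucleus", "dna-binding"], "nucleus"),
    (["nucleus", "transcription"], "nucleus"),
]
model = train(examples)
query = ["membrane", "channel"]
label, scores = predict(model, query)
evidence = explain(model, query, "membrane", "nucleus")
print(label)
for token, weight in evidence:
    print(f"{token}: {weight:+.3f}")
```

Because the Naive Bayes score is a sum of independent per-feature terms, the difference between two class scores decomposes exactly into the prior difference plus the per-token contributions returned by `explain`, which is what makes a prediction straightforward to display feature by feature.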
Citation
D. Szafron, P. Lu, R. Greiner, D. Wishart, B. Poulin, R. Eisner, Z. Lu, J. Anvik, D. Meeuwis, A. Fyshe, C. Macdonell. "Proteome Analyst: Custom Predictions with Explanations in a Web-based Tool for High-Throughput Proteome Annotations". Nucleic Acids Research (NAR), 32, pp. W365-W371, July 2004.
Keywords: | Proteome Analyst, machine learning, naive bayes, bioinformatics, empirical, medical informatics |
Category: | In Journal |
BibTeX
@article{Szafron+al:NAR04,
  author = {Duane Szafron and Paul Lu and Russ Greiner and David S. Wishart and Brett Poulin and Roman Eisner and Zhiyong Lu and John Anvik and David Meeuwis and Alona Fyshe and Cam Macdonell},
  title = {Proteome Analyst: Custom Predictions with Explanations in a Web-based Tool for High-Throughput Proteome Annotations},
  journal = {Nucleic Acids Research (NAR)},
  volume = {32},
  pages = {W365-W371},
  year = 2004,
}
Last Updated: April 28, 2012
Submitted by Russ Greiner