Predicting Subcellular Localization of Proteins using Machine-Learned Classifiers
- Zhiyong Lu
- Duane Szafron, UofA CS
- Russ Greiner, Dept of Computing Science; PI of AICML
- Paul Lu, Department of Computing Science
- David S. Wishart, Departments of Computing Science and Biology, University of Alberta
- Brett Poulin, Computing Science
- John Anvik
- Cam Macdonell, Computing Science
- Roman Eisner
Motivation: Identifying the destination or localization of proteins is key to understanding their function and facilitating their purification. A number of existing computational prediction methods are based on sequence analysis. However, these methods are limited in scope, accuracy and, most particularly, breadth of coverage. Rather than using sequence information alone, we have explored the use of database text annotations from homologs and machine learning to substantially improve the prediction of subcellular location.
Results: We have constructed five machine-learning classifiers for predicting subcellular localization of proteins from animals, plants, fungi, Gram-negative bacteria and Gram-positive bacteria, which are 81% accurate for fungi and 92-94% accurate for the other four categories. These are the most accurate subcellular predictors across the widest set of organisms ever published. Our predictors are part of the Proteome Analyst web-service.
Citation
Z. Lu, D. Szafron, R. Greiner, P. Lu, D. Wishart, B. Poulin, J. Anvik, C. Macdonell, R. Eisner. "Predicting Subcellular Localization of Proteins using Machine-Learned Classifiers". Bioinformatics, 20(4), pp 547--556, March 2004.
Keywords: Proteome Analyst, subcellular, machine learning, bioinformatics, medical informatics
Category: In Journal
BibTeX
@article{Lu+al:Bioinformatics04,
  author = {Zhiyong Lu and Duane Szafron and Russ Greiner and Paul Lu and David S. Wishart and Brett Poulin and John Anvik and Cam Macdonell and Roman Eisner},
  title = {Predicting Subcellular Localization of Proteins using Machine-Learned Classifiers},
  Volume = "20",
  Number = "4",
  Pages = {547--556},
  journal = {Bioinformatics},
  year = 2004,
}
Last Updated: April 28, 2012
Submitted by Russ Greiner