
A Machine Learned Classifier that uses Gene Expression Data to Accurately Predict Estrogen Receptor Status

Full Text: journal.pone.0082144.PDF
Other Attachments: Breast_cancer_algorithm_clip_book_Summary_PS.pdf [Auxiliary Material]

Purpose: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard method for determining this status, immunohistochemical (IHC) analysis of formalin-fixed, paraffin-embedded samples, suffers from numerous technical and reproducibility issues. Assessing ER-status from RNA expression can provide more objective, quantitative and reproducible test results.

Methods: To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized IHC testing of formalin-fixed tumor.

Results: This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93%. When applied to an independent validation set and to four other public databases, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status.

Conclusions: Our efficient and parsimonious classifier lends itself to highly accurate, low-cost RNA-based assessment of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions.
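
The workflow described above (learn a small rule from expression data, then estimate its accuracy by cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the learner is a one-gene threshold rule rather than the paper's three-gene classifier, and the function names (make_synthetic, fit_stump, cross_val_accuracy) are hypothetical. The abstract does not name the machine learning tool that was actually used.

```python
import random


def make_synthetic(n, n_genes, informative, seed=0):
    """Synthetic stand-in for expression data (NOT the study's data).

    Label 1 plays the role of ER-positive; the genes listed in
    `informative` are shifted upward for label-1 samples.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += 3.0 * label
        X.append(row)
        y.append(label)
    return X, y


def fit_stump(X, y):
    """Return the (gene, threshold) pair with the best training accuracy.

    The rule predicts label 1 when expression of `gene` exceeds `threshold`.
    """
    best_acc, best_gene, best_thr = -1.0, 0, 0.0
    for g in range(len(X[0])):
        values = sorted(set(row[g] for row in X))
        # Candidate thresholds are midpoints between adjacent observed values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            correct = sum((row[g] > thr) == bool(lab) for row, lab in zip(X, y))
            acc = correct / len(y)
            if acc > best_acc:
                best_acc, best_gene, best_thr = acc, g, thr
    return best_gene, best_thr


def cross_val_accuracy(X, y, k=5, seed=0):
    """k-fold cross-validation: refit on each training split, score the held-out fold."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        gene, thr = fit_stump([X[i] for i in train], [y[i] for i in train])
        correct += sum((X[i][gene] > thr) == bool(y[i]) for i in fold)
    return correct / len(y)


if __name__ == "__main__":
    X, y = make_synthetic(n=120, n_genes=10, informative=[2], seed=1)
    print("cross-validated accuracy:", cross_val_accuracy(X, y, k=5, seed=1))
```

The point of the sketch is that gene selection and threshold fitting happen inside each fold, so the held-out samples never influence the rule they are scored against. Accuracy on the synthetic data says nothing about the accuracies reported above.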

Citation

M. Bastani, L. Vos, N. Asgarian, J. Deschenes, K. Graham, J. Mackey, R. Greiner. "A Machine Learned Classifier that uses Gene Expression Data to Accurately Predict Estrogen Receptor Status". PLoS One, 8(12), pp e82144, November 2013.

Keywords: machine learning, medical informatics, bioinformatics, ER status
Category: In Journal
Web Links: DOI
  Journal URL

BibTeX

@article{Bastani+al:PLoSONE13,
  author = {Meysam Bastani and Larissa Vos and Nasimeh Asgarian and Jean
    Deschenes and Kathryn Graham and John Mackey and Russ Greiner},
  title = {A Machine Learned Classifier that uses Gene Expression Data to
    Accurately Predict Estrogen Receptor Status},
  volume = {8},
  number = {12},
  Pages = {e82144},
  journal = {PLoS One},
  year = 2013,
}

Last Updated: February 10, 2020
Submitted by Sabina P
