Learning Accurate Regressors for Predicting Survival Times of Individual Cancer Patients

Full Text: Hsiu-Chin_Lin.pdf

Survival prediction is the task of predicting the length of time that an individual patient will survive; accurate predictions can give doctors better guidance when selecting treatments and help patients plan for the future. This differs from standard survival analysis, which focuses on population-based studies and tries to discover prognostic factors and/or analyze the median survival times of different groups of patients.

The objective of our work, survival prediction, is different: to find the most accurate model for predicting the survival time of each individual patient. We view this as a regression problem, where we try to map the features of each patient to his/her survival time. As the relationship between features and survival time is still not understood, we consider various ways to learn these models from historical patient records. This is challenging in medical/clinical data due to the presence of irrelevant features, outliers, and missing class labels. This dissertation describes our approach for overcoming these and other challenges, producing techniques that can predict survival times.

We focus our experiments on a data set of 2402 patients, including 1260 censored patients (i.e., patients whose survival time is not known). Our approach consists of two major steps. In the first step, we apply various grouping methods to divide the data set into smaller populations. In the second step, we apply different regression models to each sub-group obtained from the first step. Our experiments show that linear regression, support vector regression, and gating regression are effective: each predictor can obtain an average cross-validated relative absolute error lower than 0.54 (where the average relative absolute error of a regressor is E[ |t - p| / p ], with t the true survival time and p our prediction for each patient). We also use our regressors to classify each patient as a "long survivor" or a "short survivor", where the classification boundary is the median survival time of the entire population; here, we show that several regressors can achieve at least 70% accuracy. These experimental results verify that we can effectively predict patients' survival times with a combination of statistical and machine learning approaches.
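The two evaluation measures described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the thesis's code: the function names, the handling of ties at the boundary (counted as "short"), and the sample numbers are all assumptions made here, and censored patients are simply left out of the example.

```python
from statistics import median


def relative_absolute_error(true_times, predicted_times):
    """Mean of |t - p| / p over patients, following the abstract's definition.

    Assumes every prediction p is non-zero and that both sequences
    are the same (non-zero) length.
    """
    if len(true_times) != len(predicted_times) or not true_times:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(t - p) / p for t, p in zip(true_times, predicted_times))
    return total / len(true_times)


def long_short_accuracy(true_times, predicted_times, boundary=None):
    """Fraction of patients whose predicted class matches the true class.

    A patient is a "long survivor" if the time exceeds the boundary.
    By default the boundary is the median of the supplied true times
    (an assumption; the abstract uses the median of the entire population).
    """
    if boundary is None:
        boundary = median(true_times)
    hits = sum((t > boundary) == (p > boundary)
               for t, p in zip(true_times, predicted_times))
    return hits / len(true_times)


# Made-up survival times (e.g. in months), purely for illustration.
true_times = [10.0, 20.0, 30.0, 40.0]
predicted_times = [8.0, 40.0, 30.0, 50.0]

print(relative_absolute_error(true_times, predicted_times))
print(long_short_accuracy(true_times, predicted_times))
```

Note that the error is normalized by the prediction p rather than the true time t, as the abstract states, so an over-prediction is penalized less than an under-prediction of the same absolute size.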

Citation

H. Lin. "Learning Accurate Regressors for Predicting Survival Times of Individual Cancer Patients". MSc Thesis, University of Alberta, October 2010.

Keywords: Survival Prediction, Medical Informatics, Machine Learning
Category: MSc Thesis

BibTeX

@mastersthesis{Lin:10,
  author = {Hsiu-Chin Lin},
  title = {Learning Accurate Regressors for Predicting Survival Times of
    Individual Cancer Patients},
  School = {University of Alberta},
  year = 2010,
}

Last Updated: November 24, 2010
Submitted by Russ Greiner
