
Finding Effective Ways to (Machine) Learn fMRI-based Classifiers from Multi-Site Data

Full Text: finding-effective-ways.pdf PDF

Machine learning techniques often require many training instances to find useful patterns, especially when the signal is subtle in high-dimensional data. This is particularly true when seeking classifiers of psychiatric disorders from fMRI (functional magnetic resonance imaging) data. Given the relatively small number of instances available at any single site, many projects try to use data from multiple sites. However, forming a dataset by simply concatenating the data from the various sites often fails due to batch effects; that is, the accuracy of a classifier learned from such a multi-site dataset is often worse than that of a classifier learned from a single site. We show why several simple, commonly used techniques (such as including the site as a covariate, z-score normalization, or whitening) are useful only in very restrictive cases. Additionally, we propose an evaluation methodology to measure the impact of batch effects in classification studies, and a technique for solving batch effects under the assumption that they are caused by a linear transformation. We empirically show that this approach consistently improves the performance of classifiers in multi-site scenarios and is more stable than the other approaches analyzed.
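
As a rough illustration of one of the simple baselines named in the abstract, the sketch below applies z-score normalization separately within each site. It is not the method proposed in the paper. The function name `zscore_per_site` and the synthetic two-site data are assumptions made for this example only. The synthetic batch effect is a per-feature scale and shift, which is one restrictive case where per-site z-scoring does remove the site difference. A batch effect that mixes features together would not be undone this way.

```python
import numpy as np

def zscore_per_site(X, sites):
    """Standardize each feature separately within each site (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    sites = np.asarray(sites)
    out = np.empty_like(X)
    for s in np.unique(sites):
        mask = sites == s
        mu = X[mask].mean(axis=0)
        sd = X[mask].std(axis=0)
        sd[sd == 0] = 1.0  # guard against constant features
        out[mask] = (X[mask] - mu) / sd
    return out

# Synthetic data: 200 subjects, 5 features, two sites of 100 subjects each.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 5))
sites = np.repeat([0, 1], 100)

# Assumed batch effect: site 1 measurements are rescaled and shifted.
X = base.copy()
X[sites == 1] = 3.0 * X[sites == 1] + 10.0

Z = zscore_per_site(X, sites)
Z_base = zscore_per_site(base, sites)  # the same data without the batch effect
```

Here `Z` and `Z_base` agree up to floating-point error, because a positive per-feature scale and a shift cancel out under per-site standardization.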

Citation

R. Vega, R. Greiner. "Finding Effective Ways to (Machine) Learn fMRI-based Classifiers from Multi-Site Data". Machine Learning in Clinical Neuroimaging (MLCN), Springer, Cham, (ed: Stoyanov D. et al.), 11038, pp 32-39, October 2018.

Keywords: multi-site fMRI, batch effects, machine learning
Category: In Workshop
Web Links: DOI:
  Springer

BibTeX

@misc{Vega+Greiner:MLCN18,
  author = {Roberto Vega and Russ Greiner},
  title = {Finding Effective Ways to (Machine) Learn fMRI-based Classifiers
    from  Multi-Site Data},
  Booktitle = {MLCN 2018, DLF 2018, IMIMIC 2018. Lecture Notes in Computer
    Science},
  Publisher = {Springer, Cham},
  Editor = {Stoyanov D. et al.},
  Volume = "11038",
  Pages = {32-39},
  booktitle = {Machine Learning in Clinical Neuroimaging (MLCN)},
  year = 2018,
}

Last Updated: February 11, 2020
Submitted by Sabina P
