Methodology and Statistics
Institute of Psychology
Faculty of Social and Behavioural Sciences, Leiden University
Phone: +31 71 527 7330
Personal webpage Wouter van Loon
Stacked Domain Learning for multi-domain data: A new ensemble method
In health research, data are increasingly collected from multiple domains, such as questionnaires, structural MRI, functional MRI, EEG, genetics, and metabolomics. These domains may be further divided into sub-domains: for example, many different sets of features can be computed from fMRI data alone. Combining data from multiple domains may increase accuracy in the early diagnosis of, for example, Alzheimer’s disease. Furthermore, identifying the important domains can lead to simpler, more interpretable diagnostic models.
Currently, most multi-domain data are analyzed through concatenation: the features from all domains are placed in one large matrix, and a single model is fitted to the complete data. We propose an alternative called Stacked Domain Learning, which trains a model on each domain separately and then uses a meta-learner to optimally combine the predictions of the domain-specific models.
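The two-level idea described above can be illustrated with a minimal sketch. It is not the project's implementation: it assumes logistic regression for both the domain-specific models and the meta-learner, assumes the meta-learner is trained on 5-fold cross-validated predictions (a common stacking convention), and uses simulated toy data in which only the first domain carries signal. All function names (`fit_logistic`, `stacked_domain_fit`, `stacked_domain_predict`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict_proba(w, X):
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]


def stacked_domain_fit(domains, y, n_folds=5):
    """domains: list of feature matrices (one per domain), rows aligned with y."""
    n = len(y)
    folds = [i % n_folds for i in range(n)]
    # Z holds one out-of-fold prediction per observation and per domain.
    Z = [[0.0] * len(domains) for _ in range(n)]
    for d, X in enumerate(domains):
        for k in range(n_folds):
            train = [i for i in range(n) if folds[i] != k]
            test = [i for i in range(n) if folds[i] == k]
            w = fit_logistic([X[i] for i in train], [y[i] for i in train])
            for i, p in zip(test, predict_proba(w, [X[i] for i in test])):
                Z[i][d] = p
    meta = fit_logistic(Z, y)                       # meta-learner on predictions
    base = [fit_logistic(X, y) for X in domains]    # refit each domain on all data
    return base, meta


def stacked_domain_predict(base, meta, domains):
    cols = [predict_proba(w, X) for w, X in zip(base, domains)]
    Z = [list(row) for row in zip(*cols)]
    return predict_proba(meta, Z)


# Toy data (an assumption for illustration): domain A is informative, domain B is noise.
rng = random.Random(0)
n = 100
y = [i % 2 for i in range(n)]
domain_a = [[(2 * yi - 1) + rng.gauss(0, 0.5), rng.gauss(0, 1)] for yi in y]
domain_b = [[rng.gauss(0, 1) for _ in range(3)] for _ in y]

base, meta = stacked_domain_fit([domain_a, domain_b], y)
probs = stacked_domain_predict(base, meta, [domain_a, domain_b])
accuracy = sum((p > 0.5) == (yi == 1) for p, yi in zip(probs, y)) / n
print("training accuracy:", accuracy)
print("meta-learner weights (bias, domain A, domain B):", meta)
```

The accuracy printed here is computed on the training data and is therefore optimistic; a real comparison with concatenation would use held-out data. The meta-learner weights give a rough, heuristic indication of how much each domain contributes to the combined prediction.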
Stacked Domain Learning is a highly flexible method. We will study different configurations of the method and compare their performance with existing methods using both simulations and real data examples. Additionally, we will investigate how to deal with intrinsic differences (e.g. measurement error) between the domains, and how to discover cross-domain interactions. The developed methods will be shared in the form of R packages.
Prof. dr. M.J. de Rooij, Dr. M. Fokkema, Dr. E.M.L. Dusseldorp, Dr. B.T. Szabo
Leiden University – Department of Methodology and Statistics/Leiden Centre of Data Science
15 May 2017 to 14 May 2021