Nonlinear modeling with high volume data sets from systems biology
Ralph Rippe (PhD student)
Methodology and Statistics Unit
Department of Psychology, Faculty of Social and Behavioral Sciences, Leiden University
Project: project at Leiden University
Project running from: 1 June 2006 – 1 June 2011
Promotores: Prof. dr. J.J. Meulman, Prof. dr. ing. P.H.C. Eilers
Summary:
Prediction problems are typically regression or supervised classification problems, in which the development of prediction procedures and their validation go hand in hand. Prediction problems are nonlinear when categorical (ordinal or nominal) variables are involved, possibly alongside numerical variables.
Large data sets generally come in two forms: either the number of variables is very large compared to the number of observations (wide data sets), or the number of observations is extremely large (long data sets).
The current proposal will develop, extend and apply methodology to deal with both forms of large data sets, in a direction that is especially applicable to categorical data through the use of nonlinear transformations. This approach is firmly rooted in the data-analytic and algorithmic tradition of the Data Theory Group at the Faculty of Social and Behavioral Sciences at Leiden University.
Date of defence: 13 November 2012
Title of thesis: Advanced statistical tools for SNP arrays. ISBN/EAN: 978-94-90858-14-8