Rosember Guerra Urzola

School of Social and Behavioral Sciences
Methodology and Statistics
Tilburg University

Personal webpage Rosember Guerra Urzola

Project

A huge scale optimization approach to joint data modeling in the social and behavioral sciences
Almost any aspect of human behavior has become measurable, and many of us leave clear digital footprints when blogging, posting tweets and pictures, connecting with others through social media, being tracked in time and space, or having medical records that include a full DNA scan. Consequently, social science research has moved from a data-poor discipline to a data-rich one, in which such data are combined with survey data of groups. To give an example, in the study of obesity as the result of the interplay between genetic constitution and environmental factors, sociodemographic, health-related, and questionnaire data are used (Boyd et al., 2013). The aim of such integration approaches is to generate knowledge about the common driving mechanisms underlying the sources. Multivariate methods like principal component analysis and partial least squares have become standard in bioinformatics, the pioneering discipline when it comes to the use of large-scale data. Recently, these have been extended with variable selection (Witten, Tibshirani, & Hastie, 2009) and combinations thereof (Gu & Van Deun, 2016). From a modeling point of view, these methods are attractive, and they work well with data consisting of a limited number of variables. Yet, in the case of a large number of variables, computational shortcuts are used to address the issue of computation time (Friedman, Hastie, & Tibshirani, 2010), and there is not one but many (near-)optimal solutions. Hence, biased and non-unique solutions are obtained. The aim of this project is to develop a novel statistical and computational tool that tackles the stability issue generated by the existence of multiple solutions. Moreover, to address the problem of biased solutions, an optimization algorithm that dynamically chooses the variables to be considered will be proposed. Finally, we will apply the methods to real social and behavioral science problems.

Supervisors
Prof. dr. K. Sijtsma, dr. K. van Deun, dr. J.C. Vera Lizcano

Financed by
Data Science Tilburg University

Period
1 September 2018 – 1 September 2022