Felix Clouth

Methodology and Statistics
Tilburg School of Social and Behavioral Sciences
Tilburg University

Academic webpage Felix Clouth


Personalised Treatment Options Model

The aim of this project is to develop Personalized Option Profiles for patients diagnosed with cancer. By Personalized Option Profiles we mean a tool that helps patients choose the treatment that maximizes the probability of the outcome profile they personally prefer. Using the “NCR” (Netherlands Cancer Registry) and “PROFILES” (PROFILES study, Tilburg University) data sets, patients will be clustered on their outcomes using latent class analysis. In a first step, latent classes will be constructed from indicators such as survival time, recurrence of the disease, and quality of life after treatment. The number of classes will be guided by statistical information criteria (e.g., AIC and BIC), but the final choice will rest on theoretical considerations and medical interpretability. For this, we aim for close collaboration with medical decision makers (e.g., oncologists). Once a model with a suitable number of classes has been identified, in a second step each patient in our data set will be assigned to the class for which their membership probability is highest (modal classification). These first two steps can be carried out in the specialized software Latent GOLD or in R.
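
The first two steps can be illustrated with a minimal sketch. It is written in plain Python only for the sake of a self-contained example; the project itself plans to use Latent GOLD or R. Everything in it is an assumption made for illustration: the four binary indicators, the simulated patients, and the helper names `e_step` and `fit_lca`. It fits latent class models with one to three classes by EM, reports AIC and BIC, and assigns each patient to their modal class under the two-class model.

```python
import math
import random


def e_step(data, weights, probs):
    """Return posterior class probabilities per patient and the log-likelihood."""
    posteriors, loglik = [], 0.0
    for row in data:
        joint = []
        for w, p_k in zip(weights, probs):
            p = w
            for x, p_kj in zip(row, p_k):
                p *= p_kj if x else 1.0 - p_kj
            joint.append(p)
        total = sum(joint)
        loglik += math.log(total)
        posteriors.append([p / total for p in joint])
    return posteriors, loglik


def fit_lca(data, n_classes, n_iter=100, seed=1):
    """Fit a latent class model with binary indicators by EM (illustrative helper)."""
    rng = random.Random(seed)
    n, m = len(data), len(data[0])
    weights = [1.0 / n_classes] * n_classes
    # Random starting values for P(indicator = 1 | class).
    probs = [[rng.uniform(0.25, 0.75) for _ in range(m)] for _ in range(n_classes)]
    for _ in range(n_iter):
        posteriors = e_step(data, weights, probs)[0]
        for k in range(n_classes):
            n_k = max(sum(post[k] for post in posteriors), 1e-12)
            weights[k] = n_k / n
            for j in range(m):
                hits = sum(post[k] * row[j] for post, row in zip(posteriors, data))
                # Clamp so that no probability becomes exactly 0 or 1.
                probs[k][j] = min(max(hits / n_k, 1e-6), 1.0 - 1e-6)
    posteriors, loglik = e_step(data, weights, probs)
    n_params = (n_classes - 1) + n_classes * m
    return {
        "posteriors": posteriors,
        "aic": -2.0 * loglik + 2.0 * n_params,
        "bic": -2.0 * loglik + math.log(n) * n_params,
    }


# Simulated patients with four hypothetical binary outcome indicators:
# alive at 5 years, no recurrence, good physical functioning, good quality of life.
rng = random.Random(42)
true_probs = {
    "favourable": [0.9, 0.85, 0.8, 0.9],
    "unfavourable": [0.3, 0.2, 0.25, 0.2],
}
patients = []
for _ in range(400):
    profile = "favourable" if rng.random() < 0.6 else "unfavourable"
    patients.append([int(rng.random() < p) for p in true_probs[profile]])

# Step one: compare models with different numbers of classes.
fits = {k: fit_lca(patients, k) for k in (1, 2, 3)}
for k, fit in fits.items():
    print(k, "classes: AIC", round(fit["aic"], 1), "BIC", round(fit["bic"], 1))

# Step two: modal assignment under the two-class model.
modal_class = [post.index(max(post)) for post in fits[2]["posteriors"]]
```

In this sketch the information criteria are only printed, not used to pick a model automatically, in line with the plan to let interpretability decide the final number of classes. Real indicators such as survival time would need a different measurement model than the binary one assumed here.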

In a third step, predictions will be made for new patients. More precisely, a prediction model will be constructed that estimates, from tumor characteristics, general health, age, sex, etc., which latent outcome class is most likely for a new patient. Additionally, in this step we will account for the treatment effect. That is, for each treatment option relevant to that specific patient, we will estimate its effect on the probability of belonging to each of the latent outcome classes. This allows us to recommend to each new patient the treatment that best fits their outcome preferences. The prediction step can use a range of models, from multinomial logistic regression to statistical learning (i.e. black box) models; we have not yet decided which model will be used. This last step will be performed in R. This analysis strategy, also referred to as the three-step approach [1], makes analyzing big data sets feasible and is particularly suited to our needs.
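
As a rough illustration of the third step, the sketch below uses multinomial logistic regression, one of the candidate models named above. It is not the project's chosen model, and it is written in plain Python rather than R only to keep the example self-contained. All names and numbers are assumptions: the three outcome classes, the single covariate `age_z` (standardized age), the two treatment options, and the simulated data, in which treatment is assigned at random (something registry data would not guarantee). The class labels stand in for the modal assignments from the second step, and the sketch ignores the uncertainty in those assignments.

```python
import math
import random


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, row):
    """Class probabilities for one covariate row under weight matrix W."""
    return softmax([sum(w * x for w, x in zip(w_k, row)) for w_k in W])


def fit_multinomial(X, y, n_classes, lr=0.5, n_iter=300):
    """Fit multinomial logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    W = [[0.0] * d for _ in range(n_classes)]
    for _ in range(n_iter):
        grad = [[0.0] * d for _ in range(n_classes)]
        for row, label in zip(X, y):
            p = predict_proba(W, row)
            for k in range(n_classes):
                err = p[k] - (1.0 if k == label else 0.0)
                for j in range(d):
                    grad[k][j] += err * row[j]
        for k in range(n_classes):
            for j in range(d):
                W[k][j] -= lr * grad[k][j] / n
    return W


# Hypothetical outcome classes and data-generating weights per class.
# Columns of each row: [intercept, age_z, treatment_b].
CLASSES = ["long survival, good QoL", "long survival, poor QoL", "short survival"]
TRUE_W = [[0.0, -1.0, 1.5], [0.0, 0.0, 0.0], [0.0, 1.0, -0.5]]

rng = random.Random(7)
X, y = [], []
for _ in range(400):
    row = [1.0, rng.gauss(0.0, 1.0), float(rng.random() < 0.5)]
    y.append(rng.choices(range(3), weights=predict_proba(TRUE_W, row))[0])
    X.append(row)

W_hat = fit_multinomial(X, y, n_classes=3)

# New patient of average age: predicted class probabilities per treatment option.
new_patient_age_z = 0.0
option_profiles = {
    name: predict_proba(W_hat, [1.0, new_patient_age_z, flag])
    for name, flag in (("treatment A", 0.0), ("treatment B", 1.0))
}
for name, probs in option_profiles.items():
    print(name, [round(p, 2) for p in probs])

# A patient who prioritises the first outcome class would be pointed to the
# option with the highest predicted probability of that class.
preferred = 0
best_option = max(option_profiles, key=lambda name: option_profiles[name][preferred])
```

Each entry of `option_profiles` gives, for one treatment, the predicted probability of every outcome class. A recommendation then amounts to weighing these probabilities by the patient's own outcome preferences; here that weighing is reduced to a single preferred class.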

The data are available and will be provided by the Integraal Kankercentrum Nederland (IKNL), where I hold an appointment as an external researcher. Further, the prediction model will be developed in close collaboration with another PhD student who works on Natural Language Generation and will build a tool to communicate the results of this project in a personalized way.

The approach taken in this project is as follows. To begin, we will explore the different outcome variables available in the two datasets and build an outcome profiles sub-model for colon cancer [M1-M6] and a personalized option profiles model using the three-step implementation of statistical learning methods [M7-M12]. This work will result in the first paper that we aim to publish in a journal for oncology or medical statistics.

Subsequently, we will generalize these clustering and prediction tools so that they apply to other cancer types [M13-M18] and allow flexible updating when new data come in [M19-M24] (second paper). During these two stages, a first version of an R package will also be created. Next, the general toolbox will be applied to and tested with prostate cancer data [M25-M30]. Based on the experiences with the implementation of the personalized decision aids in subproject 2, we will fine-tune our models and software implementation [M31-M36] (third paper). Finally, the models and tools will be evaluated with both patients and doctors [M37-M42] (fourth paper).

[1] https://jeroenvermunt.nl/lca_three_step.pdf

Prof. dr. Jeroen Vermunt, prof. dr. Steffen Pauws

Financed by
NWO (project: DATA2PERSON)

Project period
1 September 2018 – 1 September 2022