Interuniversity Graduate School of Psychometrics and Sociometrics
Hattum, Pascal van

Market segmentation using Bayesian model based clustering

Pascal van Hattum
Department of Methodology and Statistics
Faculty of Social Sciences
Utrecht University
P.O. Box 80140, 3508 TC Utrecht
Phone: +31 30 253 7983 / 4438 (secretary)
E-mail: Pascal Van Hattum

Project: Project at Utrecht University
Project running from: 1 June 2004 – 1 June 2010
Supervisors: Prof. dr H. Hoijtink

Summary of project

Bayesian model based clustering will be developed such that it can be applied to data sets that are specifically used for market segmentation. A number of statistical issues have to be studied before these data sets can be analyzed:

  • A cluster algorithm that can handle very large data sets will have to be developed.
  • The cluster algorithm has to be able to deal with data that are missing at random.
  • The cluster model has to be a latent mixture of log-linear models that contain main effects and specific sets of two-way interactions.
  • With large data sets, many clusters are often obtained. A specific issue will be transforming the optimal clustering into a clustering with a limited number of clusters that is more workable for marketing activities.
  • The influence of the prior distribution on the marginal likelihood (a quantity that will be used to determine the number of clusters) has to be determined.
  • Obtaining clusters that are useful for market segmentation. This requires a cluster model that is restricted to yield clusters that have a high probability of containing specific types of persons (in terms of variables that are not used for the clustering).
  • Using the cluster model to predict for new persons to which cluster they belong.

This project will build on the work in Hoijtink (1998, 2001) and Hoijtink and Notenboom (2004).
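
As a rough illustration of the last issue in the list above (predicting to which cluster a new person belongs), the sketch below computes posterior membership probabilities under a plain latent class model with independent binary items. It is not the project's model, which also includes two-way interactions. The function name `posterior_membership`, the cluster weights and the item probabilities are hypothetical values chosen for illustration. Missing answers are simply skipped, which is a valid way to handle data that are missing at random in this simplified independence model.

```python
import math


def posterior_membership(response, weights, item_probs):
    """Return P(cluster | response) for one person.

    response   : list of 0/1 answers; None marks a missing answer
    weights    : cluster proportions, summing to 1 (assumed known here)
    item_probs : item_probs[k][j] = P(item j == 1 | cluster k)
    """
    log_post = []
    for w, probs in zip(weights, item_probs):
        lp = math.log(w)
        for x, p in zip(response, probs):
            if x is None:
                # Missing answer: the item drops out of the likelihood.
                continue
            lp += math.log(p if x == 1 else 1.0 - p)
        log_post.append(lp)

    # Normalize on the log scale to avoid underflow with many items.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical two-cluster model with three binary items.
weights = [0.6, 0.4]
item_probs = [[0.9, 0.8, 0.7],
              [0.2, 0.3, 0.1]]

# A new person who answered items 1 and 3 but skipped item 2.
post = posterior_membership([1, None, 1], weights, item_probs)
predicted_cluster = post.index(max(post))
```

In a full Bayesian analysis the weights and item probabilities would be draws from the posterior distribution (for example from a Markov chain Monte Carlo sampler), and the membership probabilities would be averaged over those draws rather than computed from fixed values.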

Key words: Bayesian computational statistics, Latent class analysis, Log-linear modeling, Market segmentation, Markov Chain Monte Carlo methods, Missing data, Model based clustering.

Date of defence:
Title of thesis: