Khalid, Naveed

Searching for fit to IRT models in complex data


Naveed Khalid PhD

Project at: Department of Research Methodology, Measurement, and Data Analysis (OMD), Twente University, The Netherlands

Project financed by: Twente University

Supervisors: Prof. dr C.A.W. Glas, dr ir B.P. Veldkamp

Project running from: 1 May 2006 – 1 May 2010

Item response theory (IRT) models are used to describe response behavior on psychological tests, educational assessments, and various other measurement situations in the social sciences. However, inferences made using IRT are valid only insofar as the model fits the data. Statistical tests have therefore been developed to assess model fit. These tests are informative about specific model violations, and the statistics on which they are based are firmly rooted in asymptotic theory, but in many applications they have serious problems: (1) their power becomes so large in large samples that even trivial deviations from the model are flagged as significant, (2) they do not directly reveal the impact of a model violation on the envisioned application, and (3) they lose their validity when the model is grossly violated. The present research project aims to tackle these problems. The approach has the following elements.
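
Problem (1) can be illustrated with a minimal sketch. It uses a generic Pearson chi-square statistic on a single hypothetical item, not one of the IRT fit statistics studied in the project; the proportions and sample sizes are assumptions chosen for illustration. Because the statistic grows linearly with the sample size for a fixed discrepancy, a one-percentage-point deviation that is negligible in practice eventually becomes significant.

```python
# Illustrative only: a generic Pearson chi-square statistic, not one of the
# project's IRT fit statistics. All numbers below are hypothetical.

CRITICAL_VALUE_DF1 = 3.841  # 95th percentile of chi-square with 1 degree of freedom


def pearson_chi_square(n, observed_props, expected_props):
    """Pearson X^2 for n respondents, from observed and model-expected proportions."""
    return n * sum((o - e) ** 2 / e for o, e in zip(observed_props, expected_props))


# Hypothetical item: the model predicts 70% correct, the data show 71%.
expected = [0.70, 0.30]
observed = [0.71, 0.29]

for n in (100, 1_000, 100_000):
    x2 = pearson_chi_square(n, observed, expected)
    verdict = "reject" if x2 > CRITICAL_VALUE_DF1 else "retain"
    print(n, round(x2, 3), verdict)
```

With these hypothetical numbers the model is retained at the two smaller sample sizes and rejected only at the largest one, although the size of the misfit is the same in all three cases.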

  1. A model is searched for with either a top-down or a bottom-up procedure. In the top-down procedure, the analyses start with a relatively simple model (say the 1PL, 2PL, or 3PL) and refine it into a multidimensional IRT model using test statistics and other indices of model fit. In the bottom-up procedure, the analyses start with observable item statistics, such as means and covariances, and select items for subscales in a multidimensional IRT model by optimizing IRT test statistics and fit indices. The project will focus mainly on top-down procedures; bottom-up procedures will be studied in the second stage of the project.
  2. The search for a fitting model will be driven not only by test statistics, but also by indices that indicate the impact of the misfit on the envisioned application. Two applications will be targeted: making pass-fail decisions and evaluating a structural model on the person parameters. The idea behind this approach is that no model will fit the data perfectly, but the consequences of misfit for the inferences made using the assessment should be indexed and (if possible) minimized.
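
The two elements above can be made concrete with a minimal sketch. It shows the standard 3PL item response function (which reduces to the 2PL when the guessing parameter is zero) and a crude index of how often pass-fail decisions differ between two models. The item parameters, the cut score, the rule that a person passes when the expected score reaches the cut score, and the function name `decision_disagreement` are all assumptions made for illustration; this is not the index developed in the project.

```python
import math


def irf(theta, a=1.0, b=0.0, c=0.0):
    """3PL item response function: probability of a correct response.

    a = discrimination, b = difficulty, c = guessing. With c = 0 this is
    the 2PL; with c = 0 and a common a for all items it is the 1PL.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def expected_score(theta, items):
    """Expected number-correct score at ability theta (test characteristic curve)."""
    return sum(irf(theta, a, b, c) for a, b, c in items)


# Hypothetical item parameters (a, b, c) under a 3PL model.
items_3pl = [(1.2, -1.0, 0.2), (0.8, 0.0, 0.2), (1.5, 0.5, 0.2), (1.0, 1.0, 0.2)]
# The same items with guessing ignored, i.e. a simpler 2PL description.
items_2pl = [(a, b, 0.0) for a, b, _ in items_3pl]


def decision_disagreement(items_a, items_b, cut_score, grid_size=201):
    """Share of a standard-normal ability distribution (approximated on a grid
    from -4 to 4) for which the two models give different pass-fail decisions.

    Assumed decision rule: pass if the expected score is at least cut_score.
    """
    thetas = [-4.0 + 8.0 * i / (grid_size - 1) for i in range(grid_size)]
    weights = [math.exp(-0.5 * t * t) for t in thetas]  # unnormalized normal density
    disagree = sum(
        w
        for t, w in zip(thetas, weights)
        if (expected_score(t, items_a) >= cut_score)
        != (expected_score(t, items_b) >= cut_score)
    )
    return disagree / sum(weights)


print(decision_disagreement(items_3pl, items_2pl, cut_score=2.0))
```

An index of this kind is zero when both models lead to the same decisions for every ability level, whatever a significance test says about their difference in fit, which is the sense in which impact and statistical misfit are separate questions.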

Date of defence: 9 December 2009

Title of thesis: IRT model fit from different perspectives