Computerized adaptive text-based testing in psychological and educational measurement
Qiwei He
OMD / Toegepaste Onderwijskunde, Twente University
Project financed by: Stichting Achmea Slachtofferhulp Samenleving
Project running from: 1 February 2009 – 1 June 2013
Supervisors:
– prof. dr C.A.W. Glas (Twente University)
– prof. dr ir Th. De Vries (Twente University)
Summary of project
Computerized adaptive testing (CAT; Wainer et al., 1990; van der Linden & Glas, 2002, 2010, in press) has become increasingly popular during the past decade in both educational and psychological measurement. The flexibility of CAT, combined with the possibilities of internet-based testing, makes it attractive for many operational testing programs (Bartram & Hambleton, 2006).
In CAT, the items are tailored to the level of the respondent, that is, the difficulty of each item is matched to the current estimate of the respondent's level. If the performance on previous items has been rather weak, an easier item is presented next; if the performance has been rather strong, a more difficult item is selected for administration. The main advantage of this approach is that the test length can be reduced considerably without losing measurement precision. In addition, respondents are administered items at their specific ability level, which implies that they will not get bored by too easy items or frustrated by too difficult ones.
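As a rough illustration of this selection-and-update cycle, the following Python sketch simulates a short adaptive test under a simple one-parameter (Rasch) IRT model, introduced in the next paragraph. The item pool, ability values, and grid-based estimator are illustrative choices only, not part of the project itself.

import numpy as np

def prob_correct(theta, b):
    # Rasch model: probability of a correct/positive response given ability theta
    # and item difficulty b.
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def select_next_item(theta_hat, difficulties, administered):
    # Choose the not-yet-administered item whose difficulty is closest to the
    # current ability estimate; under the Rasch model this is the most informative item.
    candidates = [i for i in range(len(difficulties)) if i not in administered]
    return min(candidates, key=lambda i: abs(difficulties[i] - theta_hat))

def update_theta(responses, used_difficulties, grid=np.linspace(-4, 4, 161)):
    # Grid-based MAP update of ability: standard-normal prior plus the response likelihood.
    logpost = -0.5 * grid ** 2
    for x, b in zip(responses, used_difficulties):
        p = prob_correct(grid, b)
        logpost += x * np.log(p) + (1 - x) * np.log(1 - p)
    return grid[np.argmax(logpost)]

# Toy adaptive test: 30-item pool, 10 items administered, one simulated respondent.
rng = np.random.default_rng(1)
pool = np.linspace(-3.0, 3.0, 30)      # item difficulties
true_theta, theta_hat = 0.5, 0.0       # true ability and its starting estimate
administered, responses = [], []

for _ in range(10):
    item = select_next_item(theta_hat, pool, administered)
    administered.append(item)
    responses.append(int(rng.random() < prob_correct(true_theta, pool[item])))
    theta_hat = update_theta(responses, pool[administered])

print("Estimated ability after 10 adaptive items:", theta_hat)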
The measurement framework underlying CAT comes from Item Response Theory (IRT). One of the key features of IRT is that both item and person parameters are distinguished in the measurement model. For dichotomously scored items, the probability of a correct or positive response depends on person parameters, such as the ability level of the person, and on item parameters, such as the difficulty, discrimination, and pseudo-guessing parameters. For a thorough introduction to IRT, the reader is referred to Hambleton and Swaminathan (1985) or Embretson and Reise (1991).
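In standard notation, the three-parameter logistic model alluded to here gives the probability that person i answers item j correctly as

\[
P(X_{ij} = 1 \mid \theta_i) = c_j + (1 - c_j)\,\frac{\exp\{a_j(\theta_i - b_j)\}}{1 + \exp\{a_j(\theta_i - b_j)\}},
\]

where \(\theta_i\) is the ability of person i, and \(a_j\), \(b_j\), and \(c_j\) are the discrimination, difficulty, and pseudo-guessing parameters of item j; the Rasch model used in the sketch above is the special case \(a_j = 1\), \(c_j = 0\).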
In this PhD project, the focus is on open-answer questions, for which more complicated automated scoring algorithms have to be developed. Applications lie within the context of psychological and educational measurement.
The technology of CAT has been developed for multiple-choice items in the cognitive domain that are dichotomously or polytomously scored. For these items, both the correct and the incorrect answers are precisely defined and automated scoring can be implemented on the fly. For other item types, application of CAT is less straightforward.
For open-answer questions, for example, automated scoring rules can be much more complicated. Furthermore, CAT is increasingly applied outside the traditional cognitive domain. Initially, the present project will focus on the assessment of post-traumatic stress disorder (PTSD).
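To make the idea of automated scoring of open answers concrete, the following sketch scores free-text responses with a generic bag-of-words classifier built with scikit-learn; the example responses, labels, and pipeline are hypothetical and are not the project's actual scoring algorithm.

# Hypothetical automated scoring of free-text answers via supervised text classification.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labeled responses: 1 = response indicates the symptom, 0 = it does not.
train_texts = [
    "I keep having nightmares about the accident",
    "I avoid driving past the place where it happened",
    "I sleep well and feel calm most days",
    "I have no trouble concentrating at work",
]
train_labels = [1, 1, 0, 0]

scorer = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
scorer.fit(train_texts, train_labels)

# The predicted label could then serve as a dichotomous item score in an IRT model.
new_response = ["Loud noises still startle me and bring the memories back"]
print(scorer.predict(new_response))

In this way, a text-derived item score could feed into an IRT-based adaptive test, in line with the combination of text mining and IRT named in the thesis title.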
Date of defence: 3 October 2013
Title of thesis: Text mining and IRT for psychiatric and psychological assessment