Vereniging Hogescholen, project 10voordeleraar &
Department of Research Methodology, Measurement and Data Analysis
Faculty of Behavioural Sciences
University of Twente
Setting standards for exams is an ongoing process involving different stakeholders. In this thesis, the term ‘standard’ is used in its theoretical sense: an agreement about the minimum requirement to pass an exam in terms of knowledge, proficiency, ability, and aptitudes. Once this theoretical standard has been set on an exam's test score scale, it becomes a ‘performance standard,’ a synonym for a cut score or a minimum passing score on a given exam form.
Various manuals discuss the process of standard setting for exams, and critical reviews discuss standard-setting methods. Even so, too little research is available on maintaining a standard on a new exam form in a small-sample context. The literature on maintaining equivalent scores across different exam forms is mainly based on large samples and Item Response Theory (IRT). In practice, however, for example in teacher-training programmes, large samples are often unavailable, and collecting more data might take a few years. Waiting that long is not an option, because it would mean postponing grading until enough data are collected.
Chapter 3 therefore compared IRT equating with an equating method from classical test theory in a simulation study. As the results favoured IRT equating, Chapter 4 was designed to express theoretically the observed variability of the estimated cut score on the second exam form. This chapter was the most challenging part of the thesis, and further research on this topic is needed. Chapter 5 evaluated the standard-setting process in which the standard was set on each exam form separately, using the Angoff or Cohen methods. The results demonstrated an unfair cut score when different panels estimated the cut score on each exam form separately and when the cut score was estimated separately with the Cohen method. To address the criticism that the panel's composition might have caused this result, an experiment was conducted in which the composition of the expert panel was kept constant. The final chapter of the thesis concerned feedback of the results to different stakeholders. The interpretation of an examinee's results is one of the essential parts of the examination process. Results on the different parts of the exam are needed to identify the strengths and weaknesses of both the examinee and the teacher-training programmes. This thesis ends with practical advice for 10voordeleraar about setting standards for the teacher-training exams.
Prof. T.J.H.M. Eggen & Dr N.D. Verhelst
Vereniging Hogescholen, project 10voordeleraar
1 November 2016 – 2 June 2022