Klazien de Vries

Faculty of Behavioural and Social Sciences
Psychometrics and Statistics
University of Groningen


What is normal? Accurate norms and their use for psychological tests.

Psychological tests, such as intelligence tests, are widely used, for example for diagnosis and selection. Scores from test administrations are interpreted using test norms. Currently, norms are less accurate than they could be, creating norms requires more effort than necessary, and end users' interpretation of norms is often hampered by a lack of available information. We develop advanced norming methods that make (1) scoring tests, (2) designing a normative study, and (3) interpreting test results more precise and simpler than is currently possible. This makes the development, maintenance and use of high-quality psychological tests easier and cheaper, which greatly benefits test practice.

My own research will pertain to (2), the optimal design of normative studies.

The norms for each reference population (e.g., defined by age) are calculated from the scores collected in the normative sample. To ensure norms with sufficiently small sampling error, the sample must be sufficiently large and sufficiently representative. In norming, this is particularly difficult, because there are many different reference populations (e.g., depending on age).
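
To give a feel for how sampling error in a norm depends on the size of the normative sample, the following minimal sketch simulates the standard error of an estimated 5th percentile for a single reference population. It is an illustration only, not the method developed in this project: the function name `percentile_standard_error`, the sample sizes, and the number of replications are hypothetical choices, and the sketch deliberately assumes normally distributed scores and simple random sampling, which are the simplifying assumptions that are often violated in real norming situations.

```python
import random
import statistics


def percentile_standard_error(n, p=0.05, replications=1000, seed=1):
    """Monte Carlo standard error of an empirical p-th percentile.

    Illustrative assumptions (not taken from the project itself):
    scores are standard normal and drawn by simple random sampling.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        # Draw one hypothetical normative sample of size n and sort it.
        sample = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
        # Simple empirical percentile: the order statistic at index floor(p * n).
        k = min(n - 1, int(p * n))
        estimates.append(sample[k])
    # The spread of the estimates across replications is the sampling error.
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, round(percentile_standard_error(n), 3))
```

Under these assumptions the standard error shrinks roughly with the square root of the sample size, so halving it takes about four times as many participants per reference population.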

The current practice in drawing a normative sample is to determine sample sizes based on methods (e.g., Bechger et al., 2009; Oosterhuis et al., 2016; Innocenti et al., 2021) that rely upon unrealistic assumptions. These assumptions pertain to both statistics (e.g., linear relationship(s) between age and mean scores, normal score distributions) and sampling scheme (i.e., simple random sampling) (Timmerman et al., 2020). They are commonly severely violated in typical tests and norming situations. Therefore, normative samples are typically too small to achieve the required accuracy. While this is worrying enough, the issue remains unnoticed in practice: the only source of inaccuracy currently expressed in normed scores is measurement error, while sampling error is completely overlooked (Voncken et al., 2019). This also implies that possible differences in inaccuracy between reference populations are invisible in the norms. Such differences are likely to occur in practice, for example larger inaccuracies for ages near the boundaries of a test's age range (Timmerman et al., 2020).

To guarantee norms with sufficient accuracy for all reference populations, we will develop methods to optimally design the normative study, such that the sample size needed is as small as possible for the test to be normed. Specifically, we will develop methods to draw the normative sample for all continuous norming methods and relevant sampling schemes. This ensures the availability of a method that is built upon realistic assumptions for the normative data at hand.

Applying these methods would result in norms with the desired precision for all reference populations of the test (i.e., not just globally), while keeping the effort of collecting an adequately sized sample as low as possible. The effort matters, because normative studies are tedious and expensive. Test constructors will then have easier means to achieve accurate and up-to-date norms for all their reference populations. This greatly benefits test practice.

Prof. dr. M.E. Timmerman
Prof. dr. C.J. Albers
Dr. A.F. Ernst

Financed by

January 2023 – January 2027