Prof. J.M. Wicherts, Dr M.A.L.M. van Assen & Prof. J.K. Vermunt
On November 8th, 2017, Coosje Veldkamp defended her thesis entitled
Recent studies have highlighted that not all published findings in the scientific literature are trustworthy, suggesting that currently implemented control mechanisms such as high standards for the reporting of research methods and results, peer review, and replication, are not sufficient. In psychology in particular, solutions are sought to deal with poor reproducibility and replicability of research results. In this dissertation project I considered these problems from the perspective that the scientific enterprise must better recognize the human fallibility of scientists, and I examined potential solutions aimed at dealing with human error and bias in psychological science.
First, I studied whether the human fallibility of scientists is actually recognized (Chapter 2). I examined the degree to which scientists and lay people believe in the storybook image of the scientist: the image that scientists are more objective, rational, open-minded, intelligent, honest and communal than other human beings. The results suggested that belief in this storybook image is strong, particularly among scientists themselves. In addition, I found indications that scientists believe that scientists like themselves fit the storybook image better than other scientists. I consider scientists’ lack of acknowledgement of their own fallibility problematic, because I believe that critical self-reflection is the first line of defense against potential human error aggravated by confirmation bias, hindsight bias, motivated reasoning, and other human cognitive biases that could affect any professional in their work.
Then I zoomed in on psychological science and focused on human error in the use of the most widely used statistical framework in psychology: null hypothesis significance testing (NHST). In Chapters 3 and 4, I examined the prevalence of errors in the reporting of statistical results in published articles, and evaluated a potential best practice to reduce such errors: the so-called ‘co-pilot model of statistical analysis’. This model entails a simple code of conduct prescribing that statistical analyses are always conducted independently by at least two persons (typically co-authors). Using statcheck, a software package that is able to quickly retrieve and check statistical results in large sets of published articles, I replicated the alarmingly high error rates found in earlier studies. Although I did not find support for the effectiveness of the co-pilot model in reducing these errors, I proposed several ways to deal with human error in (psychological) research and suggested how the effectiveness of the proposed practices might be studied in future research.
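The kind of consistency check described above can be illustrated with a minimal sketch. This is not statcheck's actual implementation: it handles only a z-test, it ignores rounding of the reported test statistic, and the function names `two_tailed_p_from_z` and `is_consistent` as well as the example values are hypothetical. The idea is to recompute the p-value from the reported test statistic and compare it with the reported p-value at the reported precision.

```python
import math


def two_tailed_p_from_z(z: float) -> float:
    """Two-tailed p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def is_consistent(z: float, reported_p: float, decimals: int = 2) -> bool:
    """Return True if the reported p-value matches the recomputed one
    after rounding both to the reported number of decimals."""
    recomputed = two_tailed_p_from_z(z)
    return round(recomputed, decimals) == round(reported_p, decimals)


if __name__ == "__main__":
    # Hypothetical reported results: (z statistic, reported p-value)
    for z, p in [(1.96, 0.05), (1.50, 0.03)]:
        print(z, p, "consistent" if is_consistent(z, p) else "inconsistent")
```

In this sketch the first hypothetical result is flagged as consistent and the second as inconsistent, because z = 1.50 corresponds to a two-tailed p of about .13 rather than .03.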
Finally, I turned to the risk of bias in psychological science. Psychological data can often be analyzed in many different ways. The often arbitrary choices that researchers face in analyzing their data are called researcher degrees of freedom (RDFs). Researchers might be tempted to use these researcher degrees of freedom opportunistically in their pursuit of statistical significance (often called p-hacking). This is problematic because it renders research results unreliable. In Chapter 5 I presented a list of researcher degrees of freedom in psychological studies, focusing on the use of NHST. This list can be used to assess the potential for bias in psychological studies, in research methods education, and to examine the effectiveness of a potential solution to restrict opportunistic use of RDFs: study pre-registration.
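Why opportunistic use of researcher degrees of freedom makes results unreliable can be shown with a small simulation. The sketch below is an illustration under stated assumptions, not an analysis from the thesis: it assumes a single degree of freedom (a choice among several independent outcome variables), no true effect, known unit variance, and a two-group z-test. The function names and default parameters are hypothetical.

```python
import math
import random


def p_value_two_groups(n: int, rng: random.Random) -> float:
    """Two-tailed z-test p-value for two groups of size n drawn from the
    same N(0, 1) population, so any 'effect' is a false positive."""
    mean_a = sum(rng.gauss(0, 1) for _ in range(n)) / n
    mean_b = sum(rng.gauss(0, 1) for _ in range(n)) / n
    z = (mean_a - mean_b) / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))


def false_positive_rates(n_studies=2000, n=20, n_outcomes=3,
                         alpha=0.05, seed=1):
    """Compare the false-positive rate of one planned test with the rate
    obtained when the smallest of several p-values is reported."""
    rng = random.Random(seed)
    single = flexible = 0
    for _ in range(n_studies):
        p_values = [p_value_two_groups(n, rng) for _ in range(n_outcomes)]
        single += p_values[0] < alpha      # analysis fixed in advance
        flexible += min(p_values) < alpha  # best-looking outcome reported
    return single / n_studies, flexible / n_studies


if __name__ == "__main__":
    single_rate, flexible_rate = false_positive_rates()
    print(f"planned analysis:  {single_rate:.3f}")
    print(f"flexible analysis: {flexible_rate:.3f}")
```

With three independent outcomes and alpha = .05, the expected false-positive rate under the flexible strategy is 1 - 0.95^3, or roughly .14, compared with .05 for the analysis fixed in advance.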
Pre-registration requires researchers to stipulate in advance the research hypothesis, data collection plan, data analyses, and what will be reported in the paper. Different forms of pre-registration are currently emerging in psychology, mainly varying in terms of the level of detail with respect to the research plan they require researchers to provide. In Chapter 6, I assessed the extent to which current pre-registrations restricted opportunistic use of the researcher degrees of freedom on the list presented in Chapter 5. We found that most pre-registrations were not sufficiently restrictive, but that those that were written following better guidelines and requirements restricted opportunistic use of researcher degrees of freedom considerably better than basic pre-registrations that were written following a limited set of guidelines and requirements. We concluded that better instructions, specific questions, and stricter requirements are necessary in order for pre-registrations to do what they are supposed to do: to protect researchers from their own biases.
Human factors in statistics
Inferential statistics play a key role in many sciences. Although the normative workings of these statistical tools are well established, surprisingly little is known about how researchers use them in practice, how often they make mistakes therein, and whether their expectations affect their (reported) statistical results. Recent results highlight a high prevalence of errors in the reporting of statistical results in peer-reviewed journals and show that these errors are predominantly in favor of the researcher’s hypothesis. We argue that human factors in statistics are a potential source of bias in the (reported) outcomes of scientific studies.
In this project, we study how human factors affect the accuracy of reported statistical results in the scientific literature, and to what extent scientists differ from non-scientists with respect to human fallibility. Taking a social-psychological as well as a methodological perspective, we aim to learn more about the psychology of the use of statistics.