**Project**

*Moderator Analysis in Meta-Analysis*

The number of research publications is growing exponentially with a growth rate of about 4%, leading to a massive number of research findings each year (Bornmann et al., 2021). Recently, the COVID-19 pandemic has vastly accelerated the growth of the scientific literature and its availability through increased funding, a faster peer-review and publication process, lower publication standards, and more preprints and open-access options (Nane et al., 2023). For scientists, this poses a major challenge, as it becomes increasingly difficult to digest the available, sometimes conflicting findings and to formulate policy or treatment recommendations from them. Instead of reading thirty individual papers that try to answer a certain research question, it is more efficient to read a single meta-analysis that synthesizes the findings of the individual studies. A meta-analysis is a statistical method for calculating a weighted average of the effect sizes reported in different research papers on the same research question.

This overall effect size can then be used to determine whether a certain treatment effect remains present and significant when taking the entire available literature into account. This gives more confidence in the existence of a finding and its robustness across slightly different research settings.
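The weighted average described above can be sketched in a few lines. This is a minimal illustration rather than the project's own method: it assumes a common-effect model with inverse-variance weights, and the function name and the three effect sizes and sampling variances are hypothetical.

```python
import math


def common_effect_estimate(effects, variances):
    """Inverse-variance weighted average of study effect sizes.

    Assumes a common-effect model: each study is weighted by the
    reciprocal of its sampling variance.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Hypothetical effect sizes and sampling variances from three studies.
effects = [0.30, 0.10, 0.50]
variances = [0.04, 0.01, 0.09]

estimate, standard_error = common_effect_estimate(effects, variances)
z_value = estimate / standard_error  # test statistic for the overall effect
print(round(estimate, 3), round(standard_error, 3), round(z_value, 2))
```

The most precise study (variance 0.01) dominates the estimate, which is why the pooled value lies close to 0.10 rather than at the unweighted mean of 0.30.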

The benefit of meta-analysis lies not only in quantifying average effects across studies, such as treatment effects or correlations, but also in estimating the variation between studies' effects and finding potential explanations for this variation (Schmid et al., 2020). For example, a researcher who combines studies conducted in the United States with studies conducted in China may observe that the effect sizes from the two countries differ systematically in size or even direction. This variability is referred to as heterogeneity between studies. Causes of heterogeneity can be explored using meta-regression or mixed-effects meta-regression models, that is, models that include independent variables often referred to as moderators or covariates (Borenstein et al., 2009; Schmid et al., 2020), hereafter referred to only as moderators. Moderators can, for instance, represent differences in study design, country of origin, publication date, or participant characteristics. Analyzing moderator effects to understand why effect sizes differ between studies is important for generating new hypotheses and theories as well as for policymaking. For example, a moderator effect may suggest that a treatment is only effective in a certain subgroup of the population, thereby stimulating the search for new treatments for other subgroups and guiding treatment recommendations. As such, correctly conducting moderator analyses in meta-analysis is important for scientific progress as well as for applied practice.
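A moderator analysis with a single binary moderator can be sketched as a weighted least-squares regression of effect sizes on the moderator. The sketch below is an illustration under simplifying assumptions: it fits a fixed-effects meta-regression with inverse-variance weights and ignores residual heterogeneity, whereas a mixed-effects model would add an estimate of the between-study variance to each sampling variance. The function name and the data (two studies per country, coded 0 and 1) are hypothetical.

```python
def weighted_meta_regression(effects, variances, moderator):
    """Weighted least squares with an intercept and one moderator.

    Weights are the reciprocals of the sampling variances, so this is a
    fixed-effects meta-regression (no residual heterogeneity term).
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, moderator)) / total_weight
    y_bar = sum(w * y for w, y in zip(weights, effects)) / total_weight
    s_xy = sum(
        w * (x - x_bar) * (y - y_bar)
        for w, x, y in zip(weights, moderator, effects)
    )
    s_xx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, moderator))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: moderator = 0 for one country, 1 for the other.
effects = [0.10, 0.20, 0.50, 0.60]
variances = [0.02, 0.02, 0.02, 0.02]
country = [0, 0, 1, 1]

intercept, slope = weighted_meta_regression(effects, variances, country)
print(round(intercept, 2), round(slope, 2))
```

With a 0/1 moderator, the intercept is the weighted mean effect in the reference group and the slope is the difference between the two groups' weighted means, which is the moderator effect of interest.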

This PhD project will focus on studying moderator analyses in meta-analysis. The goal is to identify shortcomings in the estimation and testing of moderator effects and to develop new methods and/or extend existing methods to improve their estimation. Potential shortcomings in moderator analyses include: i) bias, specifically publication bias (Rothstein et al., 2005), ii) missing moderator data (Pigott, 2012), iii) insufficient power to test moderator effects (Hedges & Pigott, 2004), and iv) the need to select a small number of moderator variables from a large set of potential candidates to avoid multiple testing and to keep the model identifiable (Schmid et al., 2020). Working on these shortcomings during this PhD qualifies as an IOPS project, as meta-analyses are common in the field of psychology and are also used to combine studies on psychological constructs (Borenstein et al., 2009).

In the first project, the influence of publication bias on the estimation of moderator effects will be studied. Publication bias occurs when only a subset of the studies on a certain topic is published. This leads to an unrepresentative sample of studies being included in a meta-analysis and to invalid conclusions being drawn from the results (Rothstein et al., 2005). One assumed reason for withholding studies from publication is the statistical non-significance of their effect sizes. In common-effect and random-effects meta-analysis models, this leads to an overestimation of the overall effect size. How publication bias affects moderator estimates in (mixed-effects) meta-regression models is less clear. Additionally, many methods have been developed to correct for publication bias in average effect size estimates (for an overview, see Jin et al., 2014; Marks-Anglin & Chen, 2020). Whether these methods can be used in their original form, or whether they need to be adjusted to correct for publication bias affecting the moderator effect, requires further research.
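The overestimation caused by selective publication of significant results can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method from the cited literature: the true effect (0.2), the range of standard errors, the one-sided significance rule, and the 10% publication probability for non-significant studies are all hypothetical choices.

```python
import random


def inverse_variance_mean(effects, std_errors):
    """Common-effect estimate: inverse-variance weighted mean."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def simulate(n_studies=5000, true_effect=0.2, p_publish_nonsig=0.1, seed=1):
    """Compare the pooled estimate from all studies with the pooled
    estimate from the 'published' subset only."""
    rng = random.Random(seed)
    all_y, all_se, pub_y, pub_se = [], [], [], []
    for _ in range(n_studies):
        se = rng.uniform(0.1, 0.4)        # hypothetical study precision
        y = rng.gauss(true_effect, se)    # observed effect size
        all_y.append(y)
        all_se.append(se)
        significant = y / se > 1.96       # positive and significant
        # Significant studies are always published; others only rarely.
        if significant or rng.random() < p_publish_nonsig:
            pub_y.append(y)
            pub_se.append(se)
    return (
        inverse_variance_mean(all_y, all_se),
        inverse_variance_mean(pub_y, pub_se),
    )


all_estimate, published_estimate = simulate()
print(round(all_estimate, 3), round(published_estimate, 3))
```

The estimate based on all simulated studies stays close to the true effect, while the estimate based on the published subset is larger. The same design could be extended with a moderator to examine how selection distorts a meta-regression slope, which is the open question raised above.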

When applying (mixed-effects) meta-regression models in practice, researchers may face the problem that not all studies provide information on a moderator. In a set of 64 meta-analyses published in 2016 in the fields of psychology, medicine and education, 45% of the papers mentioned having missing moderator variables (Tipton et al., 2019). Recently, Lee and Beretvas (2022) studied methods for handling missing moderators in meta-regression, including complete-case analysis, shifting case analysis, multiple imputation, and full information maximum likelihood. These methods all assume that data are missing at random (MAR), meaning that the missingness is related to observed but not to unobserved variables (Rubin, 1976). If, however, a moderator value is not reported because its estimated effect is not statistically significant, MAR no longer holds and the data must be assumed to be missing not at random (MNAR). The currently available methods for dealing with missing moderator values in meta-analyses can therefore not be applied when the data are MNAR, as is the case under publication bias, for instance. Studying publication bias at the moderator level is another potential project within this PhD thesis.

Subsequent projects could look at the intertwined issues of low power for testing moderator effects and uncertainty about which and how many moderators to include in the meta-analysis model. When only a small sample of studies is available but a large number of moderators is thought to explain heterogeneity between effect sizes, i) the power to test moderator effects may be too low, ii) the model may not be identifiable in extreme cases (i.e., when the number of moderators exceeds the number of studies in the meta-analysis minus the other parameters that need to be estimated, such as the residual heterogeneity and the overall effect/intercept), or iii) testing moderators in separate models inflates the type I error rate and ignores dependencies between moderator variables (Schmid et al., 2020). Meta-analyses with a small number of primary studies are common in medical and psychological research, although the problem is somewhat less severe in the latter. A sample of 14,886 health-care meta-analyses from the Cochrane Database was found to include a median of only 3 studies (IQR = 4; Turner et al., 2015), while a set of 747 meta-analyses published in 61 articles in the journal Psychological Bulletin showed a median of 12 studies per meta-analysis (IQR = 28; van Erp et al., 2017). To address these issues, recent work has applied classification and regression trees (Li et al., 2018), information-theoretic approaches (Cinar et al., 2021), and LASSO meta-regression (Laurin, 2014) to select moderators. These methods are not yet widely used, and recommendations on when to use which method in practice may be needed. This might be another project in this PhD.
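The identifiability and multiple-testing concerns above come down to simple arithmetic, sketched here for the median numbers of studies quoted in the text. The function names are hypothetical, the count of two extra parameters (intercept and residual heterogeneity) follows the description above, and the familywise error formula assumes independent tests, which real moderators often violate.

```python
def max_identifiable_moderators(n_studies, n_other_parameters=2):
    """Largest number of moderators before the model stops being
    identifiable, reserving parameters for the intercept and the
    residual heterogeneity (assumed to be the only other parameters)."""
    return max(n_studies - n_other_parameters, 0)


def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across n_tests
    separate tests, assuming the tests are independent."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Median numbers of studies reported in the text above.
cochrane_limit = max_identifiable_moderators(3)
psych_bulletin_limit = max_identifiable_moderators(12)

# Testing five moderators in five separate models at alpha = 0.05.
error_five_tests = familywise_error_rate(5)
print(cochrane_limit, psych_bulletin_limit, round(error_five_tests, 3))
```

Under these assumptions, a meta-analysis of median size in the Cochrane sample leaves room for only one moderator, and testing five moderators separately raises the chance of at least one false positive from 5% to roughly 23%.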

**Supervisors**

Dr. R.C.M. van Aert

Prof. Dr. M.A.L.M. van Assen

Prof. Dr. J.M. Wicherts

**Financed by**

Department of Methodology and Statistics at Tilburg University

**Period**

October 2022 – October 2026