Abstract
When analyzing data, researchers make certain decisions that can be arbitrary, based on subjective beliefs about the data generating process, or for which equally justifiable alternatives exist. This wide array of data-analytic choices can be prone to abuse and has been identified as one of the underlying causes of the replication crisis in several fields.
Recently, the introduction of multiverse analysis has provided a method to assess the stability of results across various data specifications (e.g., pre-processing). Subsequently, specification curve analysis has added an inferential procedure that makes it possible to test, across different specifications, the null hypothesis that a predictor of interest is not associated with the outcome. However, this approach is limited to simple cases related to the linear model. Moreover, it only infers whether at least one specification rejects the null hypothesis, without indicating which specifications should be selected.
We propose the Post-selection Inference approach to Multiverse Analysis (PIMA), a flexible and general inferential approach to multiverse analysis based on a conditional resampling procedure. It accommodates a wide range of data specifications and any generalized linear model. Researchers can test a null hypothesis in different specifications with strong control of the familywise error rate. Since the method accounts for multiplicity, they are able to freely choose the preferred specification post hoc, after trying all reasonable choices and seeing the results.
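To make the idea of multiplicity-adjusted inference across specifications concrete, here is a minimal sketch of a single-step max-T permutation adjustment. It is not the PIMA procedure itself, which relies on a conditional resampling scheme for generalized linear models. It is a simplified stand-in that assumes no covariates, uses the absolute Pearson correlation as the test statistic, and treats each specification as a different pre-processing of the outcome. The function name `maxt_adjusted_pvalues` and its parameters are hypothetical.

```python
import numpy as np


def maxt_adjusted_pvalues(x, outcomes, n_perm=999, seed=0):
    """Single-step max-T adjusted p-values, one per specification.

    x        : (n,) predictor of interest
    outcomes : (n, K) array, one column per specification
               (assumed here: K different pre-processings of the outcome)
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    Y = np.asarray(outcomes, dtype=float)
    n = x.shape[0]

    # Standardise so that (xz @ Yz) / n is the Pearson correlation per column.
    xz = (x - x.mean()) / x.std()
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0)

    def abs_corr(xv):
        return np.abs(xv @ Yz) / n

    observed = abs_corr(xz)

    # The same permutation of x is applied to every specification, so the
    # dependence between specifications is preserved in the null distribution.
    max_null = np.empty(n_perm)
    for b in range(n_perm):
        max_null[b] = abs_corr(rng.permutation(xz)).max()

    # Compare each observed statistic with the distribution of the maximum;
    # the "+1" terms count the observed data as one of the permutations.
    exceed = (max_null[:, None] >= observed[None, :]).sum(axis=0)
    return (1 + exceed) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=200)
    y = 2.0 * x + rng.normal(size=200)
    # Three illustrative specifications: raw, squashed, and rank-transformed outcome.
    specs = np.column_stack([y, np.tanh(y), np.argsort(np.argsort(y))])
    print(maxt_adjusted_pvalues(x, specs))
```

Because every adjusted p-value is computed against the distribution of the maximum statistic over all specifications, any specification whose adjusted p-value falls below the chosen level can be reported after looking at all of them, which is the kind of post hoc selection the abstract describes.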
Organization
Rossella Miglio