Reporting and interpreting non-significant results in animal cognition research
How statistically non-significant results are reported and interpreted following null hypothesis significance testing is often criticized. This issue is important for animal cognition research because studies in the field are often underpowered to detect theoretically meaningful effect sizes; that is, they often produce non-significant p-values even when the null hypothesis is false. We therefore manually extracted and classified how researchers report and interpret non-significant p-values, and examined the p-value distribution of these non-significant results, across published articles in animal cognition and related fields. We found substantial heterogeneity in how researchers report statistically non-significant p-values in the results sections of articles and in how they interpret them in titles and abstracts. Describing non-significant results as "No Effect" was common in titles (84%), abstracts (64%), and results sections (41%), whereas describing them as "Non-Significant" was absent from titles (0%), less common in abstracts (26%), and frequent in results sections (52%). Discussions of effect sizes were rare (<5% of articles). An analysis of the p-value distribution was consistent with research being conducted with low statistical power to detect effect sizes of interest. These findings suggest that researchers in animal cognition should pay close attention to the evidence used to support claims of absent effects in the literature and, in their own work, should report statistically non-significant results clearly and in a formally correct manner, as well as use more formal methods of assessing evidence against theoretical predictions.
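To make the low-power argument concrete, the following is a minimal simulation sketch (not the paper's own analysis): it assumes an illustrative two-group design with 10 animals per group, a true standardized effect of d = 0.5, and alpha = 0.05, and shows that most runs yield non-significant p-values even though the null hypothesis is false.

```python
# Minimal sketch of why underpowered studies produce non-significant
# p-values despite a true effect. Sample size, effect size, and alpha
# are illustrative assumptions, not values taken from the paper.
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
n_per_group = 10      # small sample, typical of low-powered designs
true_effect = 0.5     # true standardized effect (Cohen's d); null is false
alpha = 0.05
n_sims = 10_000

p_values = np.empty(n_sims)
for i in range(n_sims):
    control = rng.normal(0.0, 1.0, n_per_group)
    treated = rng.normal(true_effect, 1.0, n_per_group)
    # Two-sided independent-samples t-test on each simulated experiment
    p_values[i] = stats.ttest_ind(treated, control).pvalue

power = np.mean(p_values < alpha)
print(f"Estimated power: {power:.2f}")               # roughly 0.18 here
print(f"Share of non-significant results: {1 - power:.2f}")
```

Under these assumptions roughly four out of five simulated experiments are non-significant despite a real effect, so interpreting such p-values as "No Effect" is unwarranted; inspecting the distribution of p-values across many studies, as done in the paper, is one way to diagnose this pattern.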