Null hypothesis significance testing: A review of an old and continuing controversy

Raymond S. Nickerson

Psychological Methods, vol. 5, no. 2, 2000, pp. 241–301

Abstract

Null hypothesis significance testing (NHST) constitutes the most prevalent approach to hypothesis evaluation in the behavioral and social sciences, yet it remains the subject of intense, long-standing controversy. Critical analysis reveals that many researchers harbor fundamental misconceptions regarding the logic of NHST, frequently confusing the probability of data given a null hypothesis with the posterior probability of the hypothesis itself. Other primary objections involve the sensitivity of $p$-values to sample size, the arbitrary dichotomization of results into “significant” and “nonsignificant” categories, and the fact that the “nil” null hypothesis of zero effect is often known to be false a priori. Conversely, supporters argue that NHST provides a necessary, standardized mechanism for ruling out chance as a plausible explanation for observed data, particularly when testing ordinal predictions. To mitigate the limitations of traditional testing, the integration of effect size estimates, confidence intervals, and power analyses is recommended. While NHST does not establish the truth of a theory or the practical importance of a finding, it remains an effective aid to data interpretation when applied with methodological rigor and supplemented by alternative statistical frameworks such as Bayesian inference and meta-analysis. – AI-generated abstract.
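The objection about the sensitivity of $p$-values to sample size, and the recommendation to report confidence intervals alongside them, can be illustrated with a minimal sketch. It is not taken from the article: it assumes a one-sample z-test with known standard deviation 1, and the observed standardized effect of 0.2 and the sample sizes are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def two_sided_p(effect_size: float, n: int) -> float:
    """Two-sided p-value for a one-sample z-test of the nil null (effect = 0).

    Assumes the population SD is known and equal to 1, so the observed
    standardized mean `effect_size` has standard error 1 / sqrt(n).
    """
    z = effect_size * sqrt(n)
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def confidence_interval(effect_size: float, n: int, level: float = 0.95):
    """Normal-theory confidence interval for the effect under the same assumptions."""
    half_width = STD_NORMAL.inv_cdf(0.5 + level / 2.0) / sqrt(n)
    return effect_size - half_width, effect_size + half_width


EFFECT = 0.2  # hypothetical observed effect, held fixed across sample sizes

for n in (10, 50, 200, 1000):
    p = two_sided_p(EFFECT, n)
    lo, hi = confidence_interval(EFFECT, n)
    print(f"n={n:5d}  p={p:.4g}  95% CI=({lo:.3f}, {hi:.3f})")
```

Under these assumptions the same observed effect is "nonsignificant" at the smaller sample sizes and "significant" at the larger ones, while the interval estimate stays centered on 0.2 and simply narrows as the sample grows.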
