AI safety can be a Pascal's mugging even if p(doom) is high
Effective Altruism Forum, April 24, 2026
Abstract
The classification of AI safety as a “Pascal’s mugging” often rests on a misunderstanding of the distinction between baseline risk and marginal impact. In decision-theoretic terms, the relevance of an intervention is determined not by the absolute probability of a catastrophic outcome, often cited as “p(doom)”, but by the marginal probability that a specific action will successfully avert that outcome. Arguments asserting that high baseline risk alone exempts AI safety from the Pascal’s mugging critique are therefore logically insufficient: a high-probability disaster can still leave an individual with an infinitesimally small chance of influencing it. In practice, however, the characterization of AI safety as a mugging is undermined by the non-negligible probability of individual efficacy in the current sociotechnical landscape. Unlike hyperbolic scenarios involving astronomical stakes and near-zero probabilities, the field of AI development is currently fluid and concentrated, and the relative proximity of individuals to key institutional decision-makers, including laboratory executives and policymakers, suggests that the probability of exerting a pivotal influence is substantial. Because the chance of a single actor making a difference is appreciable rather than “bajillion-to-one,” efforts to mitigate AI risk do not meet the criteria for a Pascal’s mugging; they instead represent high-expected-value interventions grounded in plausible causal chains. – AI-generated abstract.
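
To make the baseline-versus-marginal distinction concrete, here is a minimal expected-value sketch; the symbols Δp and V are illustrative and do not appear in the original post:

$$\mathbb{E}[\text{value of intervening}] \approx \Delta p \cdot V, \qquad \text{where } \Delta p = P(\text{doom} \mid \text{no intervention}) - P(\text{doom} \mid \text{intervention}).$$

The Pascal’s mugging structure arises when $\Delta p$ is vanishingly small, no matter how large the baseline $P(\text{doom})$ is; the abstract’s claim is that $\Delta p$ for present-day AI safety work is not vanishingly small.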
