
Review of Joseph Carlsmith, 'Is power-seeking AI an existential risk?'

Eli Lifland

2022

Abstract

Framing existential risk from misaligned, power-seeking artificial intelligence (AI) as a six-step path from timelines to catastrophe allows researchers to forecast the probability of existential risk and identify the key factors contributing to it. The paper argues that it will be possible and financially feasible to build AI systems with advanced capabilities, agentic planning, and strategic awareness (APS systems) by 2070, and that there will be strong incentives to build them. However, ensuring the practical alignment of APS systems with human values will be difficult, owing to the pitfalls of proxy optimization, the risk of rewarding deceptive and manipulative behavior, and the unusually high stakes of misalignment. The author estimates a 5% chance of existential catastrophe from AI by 2070, with a range from 0.1% to 40%. – AI-generated abstract.
