
Where I agree and disagree with Eliezer

Paul Christiano

AI Alignment Forum, June 19, 2022

Abstract

Disagreements and agreements between Paul Christiano and Eliezer Yudkowsky regarding catastrophic risks from misaligned artificial general intelligence (AGI) are identified and explained. Christiano agrees with Yudkowsky that powerful AI systems may deliberately disempower humanity and that this carries a serious risk of existential catastrophe. They also agree that current efforts addressing AGI alignment are inadequate and that straightforward attempts to prevent the construction of powerful AIs are unfeasible. Both consider that policy responses to AGI risks are likely to be ineffective or counterproductive. However, the authors disagree on how abruptly catastrophic risks will arrive, with Christiano expecting powerful AGI to emerge gradually over years, or at minimum months, rather than suddenly. They also diverge on the prospects of averting catastrophe through technological fixes or social and political solutions. Christiano presents a more nuanced view of AI development trajectories and capabilities, criticizing Yudkowsky’s focus on extreme scenarios, and questions the effectiveness of certain proposed alignment strategies. – AI-generated abstract.
