Michael Aird - Collection of discussions of key cruxes related to AI safety/alignment

Comment by MichaelA - Collection [HTTPS://FORUM.EFFECTIVEALTRUISM.ORG/POSTS/6TRT8MTSKFQJJBFJA/POST-MORE-SUMMARIES-AND-COLLECTIONS] of discussions of key cruxes related to AI safety/alignment

These are works that highlight disagreements, cruxes, debates, assumptions, etc. about the importance of AI safety/alignment, about which risks are most likely, about which strategies to prioritise, and so on. I've also included some works that attempt to clearly lay out a particular view in a way that could be particularly helpful for others trying to see where the cruxes are, even if the works themselves don't spend much time addressing alternative views. I'm not sure precisely where to draw the boundaries in order to make this collection maximally useful.

These are ordered from most to least recent. I've put in bold those works that, very subjectively, seem to me especially worth reading.

GENERAL, OR FOCUSED ON TECHNICAL WORK

Ben Garfinkel on scrutinising classic AI risk arguments [https://80000hours.org/podcast/episodes/ben-garfinkel-classic-ai-risk-arguments/] - 80,000 Hours, 2020

Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics [https://forum.effectivealtruism.org/posts/2sMR7n32FSvLCoJLQ] - James Fodor, 2020; this received pushback from Rohin Shah, which resulted in a comment thread [https://forum.effectivealtruism.org/posts/2sMR7n32FSvLCoJLQ?commentId=8eiishvfwgAiDrjqX#comments] worth adding here in its own right

Fireside Chat: AI governance [https://www.youtube.com/watch?v=bSTYiIgjgrk&list=PLwp9xeoX5p8Nje_8jJsmkz5ork-dVk9wK&index=7] - Ben Garfinkel & Markus Anderljung, 2020

My personal cruxes for working on AI safety [https://forum.effectivealtruism.org/posts/Ayu5im98u8FeMWoBZ] - Buck Shlegeris, 2020

What can the principal-agent literature tell us about AI risk? [https://www.lesswrong.com/post
