Michael Aird - Collection of discussions of key cruxes related to AI safety/alignment

Comment by MichaelA - Collection [HTTPS://FORUM.EFFECTIVEALTRUISM.ORG/POSTS/6TRT8MTSKFQJJBFJA/POST-MORE-SUMMARIES-AND-COLLECTIONS] of discussions of key cruxes related to AI safety/alignment

These are works that highlight disagreements, cruxes, debates, assumptions, etc. about the importance of AI safety/alignment, about which risks are most likely, about which strategies to prioritise, and so on. I've also included some works that attempt to clearly lay out a particular view in a way that could be particularly helpful for others trying to see where the cruxes are, even if the works themselves don't spend much time addressing alternative views. I'm not sure precisely where to draw the boundaries in order to make this collection maximally useful.

These are ordered from most to least recent. I've put in bold those works that, very subjectively, seem to me especially worth reading.

GENERAL, OR FOCUSED ON TECHNICAL WORK

Ben Garfinkel on scrutinising classic AI risk arguments [https://80000hours.org/podcast/episodes/ben-garfinkel-classic-ai-risk-arguments/] - 80,000 Hours, 2020

Critical Review of 'The Precipice': A Reassessment of the Risks of AI and Pandemics [https://forum.effectivealtruism.org/posts/2sMR7n32FSvLCoJLQ] - James Fodor, 2020; this received pushback from Rohin Shah, which resulted in a comment thread [https://forum.effectivealtruism.org/posts/2sMR7n32FSvLCoJLQ?commentId=8eiishvfwgAiDrjqX#comments] worth adding here in its own right

Fireside Chat: AI governance [https://www.youtube.com/watch?v=bSTYiIgjgrk&list=PLwp9xeoX5p8Nje_8jJsmkz5ork-dVk9wK&index=7] - Ben Garfinkel & Markus Anderljung, 2020

My personal cruxes for working on AI safety [https://forum.effectivealtruism.org/posts/Ayu5im98u8FeMWoBZ] - Buck Shlegeris, 2020

What can the principal-agent literature tell us about AI risk? [https://www.lesswrong.com/post
