
Open problems in AI X-Risk [PAIS #5]

Thomas Woodside and Dan Hendrycks

Effective Altruism Forum, June 10, 2022

Abstract

The AI safety community faces the challenge of developing safe and beneficial artificial intelligence systems. This paper presents a list of open problems in AI safety, focusing on those that are amenable to empirical machine learning research and where capabilities externalities can be avoided. The problems are grouped under four main headings: Alignment, Robustness, Monitoring, and Transparency. For each problem, the authors explain the motivation for researching it, describe what researchers are currently doing, outline what advanced research could look like, and assess its importance, neglectedness, and tractability. They also analyze how each problem relates to general capabilities and how progress in each area could affect them. Finally, they consider criticisms that have been raised against each research area. – AI-generated abstract.
