AI safety needs social scientists
EA Global, March 4, 2019
Abstract
Ensuring that advanced artificial intelligence (AI) systems are reliably aligned with human values requires resolving uncertainties about the psychology of human rationality, emotion, and biases. These uncertainties can only be resolved empirically, through experiment. The authors argue that the AI safety community needs to invest research effort in the human side of AI alignment, drawing on social scientists with expertise in human cognition, behavior, and ethics. They propose running experiments in which people play the role of AI agents, replacing machine learning (ML) agents with humans. The authors give a specific proposal for reasoning-oriented alignment, called debate, in which two AI agents debate the correct answer to a question and a human judges the winner from the transcript of the debate. They discuss the potential pitfalls and benefits of this approach, and outline a series of questions about the quality of human judgment that social science experiments can help answer. – AI-generated abstract.
