AMA: Paul Christiano, alignment researcher
AI Alignment Forum, April 28, 2021
Abstract
This is a question-and-answer session about alignment research, a field that aims to ensure that artificial intelligence systems act in accordance with human values. Questions cover a range of topics, including the alignment researcher’s own research and his opinions on various alignment-related subjects. The researcher discusses his theory of change for the Alignment Research Center, which aims to systematically lead to a better future. He also shares insights on the engine game, a game intended to teach players how to get better at aligning AI systems with human values. Human motivation in HCH (Humans Consulting HCH) and amplification schemes is discussed, with the researcher expressing concerns about motivational issues and the challenges of detecting training problems in these systems. He also reflects on the alignment research landscape, offers advice for aspiring alignment researchers, and discusses the potential for AI-induced existential catastrophes, arguing that the risk of an AI point of no return occurring within five years is low. – AI-generated abstract.
