Clarifying Inner Alignment Terminology
AI Alignment Forum, November 9, 2020
Abstract
The concepts of outer and inner alignment are essential for ensuring that an AI system’s behavior aligns with human values. This article clarifies the definitions of these concepts and their implications. It introduces a diagram illustrating how the different types of alignment relate to one another and provides a formal definition for each term. It also answers frequently asked questions to further clarify the relationships among the concepts. – AI-generated abstract.
