Alignment tax
LessWrong Wiki, October 25, 2022
Abstract
An alignment tax is the extra cost of ensuring that an AI system is aligned, relative to the cost of building an unaligned alternative. In the best-case scenario, there is no tax, and we might as well align the system. In the worst-case scenario, the tax is prohibitive, and alignment is functionally impossible. We expect the actual tax to fall somewhere between these two extremes. Paul Christiano distinguishes two main approaches for dealing with the alignment tax: reducing the tax and paying the tax. – AI-generated abstract.
