
Alignment tax

LessWrong

LessWrong Wiki, October 25, 2022

Abstract

An alignment tax is the extra cost of ensuring that an AI system is aligned, relative to the cost of building an unaligned alternative. In the best-case scenario, there is no tax, and we might as well align the system. In the worst-case scenario, the tax is prohibitive, and alignment is functionally impossible. We expect the actual tax to fall somewhere between these two extremes. Paul Christiano distinguishes two main approaches for dealing with the alignment tax: reducing the tax and paying the tax. – AI-generated abstract.
