Reflection Mechanisms as an Alignment Target - Attitudes on “near-term” AI

Eric Landgreve, Beth Barnes, and Marius Hobbhahn

LessWrong, 2023

Abstract

TL;DR
* We survey 1000 participants on their views about what values should be put into powerful AIs that we think are plausible in the near-term (e.g. within 5-10 years)
* We find that responden…
