
Beware safety-washing

Lizka Vaintrob

January 13, 2023

Abstract

The article discusses the phenomenon of ‘safety-washing’ in the context of Artificial Intelligence (AI) development. The author draws parallels with ‘greenwashing’ and ‘humanewashing’, where companies make misleading claims about their environmental and ethical practices to gain public approval. Safety-washing involves AI companies exaggerating their commitment to safety, potentially obscuring genuine efforts and leading to a false sense of security. The article lists various tactics used in safety-washing, including focusing on specific, convenient safety paradigms, conflating safety with other desirable AI attributes, and promoting misleading narratives about the seriousness of AI risks. The author discusses the potential harms of safety-washing, such as confusion regarding genuine safety concerns, incentivizing companies to prioritize marketing over real safety measures, and hindering the development of effective AI risk mitigation strategies. The author proposes several strategies for combating safety-washing, including promoting clearer definitions of AI safety, encouraging external validation of AI development, and openly criticizing organizations engaging in such practices. – AI-generated abstract.
