Thinking inside the box: controlling and using an oracle AI

Stuart Armstrong, Anders Sandberg, and Nick Bostrom

Minds and Machines, vol. 22, no. 4, 2012, pp. 299–324

Abstract

There is no strong reason to believe that human-level intelligence represents an upper limit of the capacity of artificial intelligence, should it be realized. This poses serious safety issues, since a superintelligent system would have great power to direct the future according to its possibly flawed motivation system. Solving this issue in general has proven to be considerably harder than expected. This paper looks at one particular approach, Oracle AI. An Oracle AI is an AI that does not act in the world except by answering questions. Even this narrow approach presents considerable challenges. In this paper, we analyse and critique various methods of controlling the AI. In general, an Oracle AI might be safer than an unrestricted AI, but it remains potentially dangerous.
