The replication and emulation of GPT-3
December 21, 2022
Abstract
The article explores the resources required to replicate GPT-3 and the implications for the diffusion of large language models. The author estimates the compute cost and talent needed to replicate GPT-3, using the OPT-175B model as a case study and the Hoffmann et al. scaling laws as a theoretical model. The article concludes that the cost of replication has decreased significantly over the last two years, driven by improvements in GPU price-performance and by the publication of relevant information and open-source software. The author also finds that the development of GPT-3-like models has been limited to well-funded companies and to collaborations among companies, academia, and government entities. The article argues that limiting access to compute is a promising way to mitigate the harms of diffusion, and that attention to information is just as important as its sharing or publication in shaping the diffusion of AI technology. – AI-generated abstract
