Publication decisions for large language models, and their impacts
December 21, 2022
Abstract
This article analyzes publication decisions for large language models (LLMs) and their effect on the diffusion of LLM technology. It argues that openly publishing algorithmic details and hyperparameters significantly affects the diffusion of LLMs by lowering the computational barrier to replicating them. Publishing training code, datasets, and trained models also matters, but less than publishing implementation details. The article also examines the rationales behind the different release strategies: commercial incentives, misuse concerns, and a reactive push for openness. The author predicts that leading LLM developers will publish less openly over time, driven by incentives to maintain a capability advantage and protect commercial intellectual property. However, the reactive openness dynamic is expected to continue for at least the next five years, driven by lower-resourced actors' frustration with their limited access to state-of-the-art models. – AI-generated abstract
