Paper Title
Predictability and Surprise in Large Generative Models
Paper Authors
Paper Abstract
Large-scale pre-training has recently emerged as a technique for creating capable, general-purpose generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many others. In this paper, we highlight a counterintuitive property of such models and discuss the policy implications of this property. Namely, these generative models have an unusual combination of predictable loss on a broad training distribution (as embodied in their "scaling laws") and unpredictable specific capabilities, inputs, and outputs. We believe that the high-level predictability and appearance of useful capabilities drive rapid development of such models, while the unpredictable qualities make it difficult to anticipate the consequences of model deployment. We illustrate how this combination can lead to socially harmful behavior, drawing on examples from the literature and real-world observations, and we also perform two novel experiments to illustrate our point about harms from unpredictability. Furthermore, we analyze how these conflicting properties combine to give model developers various motivations for deploying these models, as well as challenges that can hinder deployment. We conclude with a list of possible interventions the AI community could take to increase the chance of these models having a beneficial impact. We intend this paper to be useful to policymakers who want to understand and regulate AI systems, technologists who care about the potential policy impact of their work, and academics who want to analyze, critique, and potentially develop large generative models.
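The "scaling laws" the abstract refers to describe how a model's training loss falls smoothly and predictably as a power law in scale (e.g., parameter count). The sketch below illustrates the general shape of such a law; the constants are of the rough magnitude reported in the scaling-laws literature and should be treated as illustrative assumptions, not results from this paper.

```python
# Illustrative power-law scaling curve: L(N) = (N_c / N) ** alpha,
# where N is the number of model parameters and N_c, alpha are fitted
# constants. The values below are illustrative assumptions, not
# measurements from this paper.

def scaling_law_loss(n_params: float,
                     n_c: float = 8.8e13,
                     alpha: float = 0.076) -> float:
    """Predicted training loss for a model with n_params parameters."""
    return (n_c / n_params) ** alpha

if __name__ == "__main__":
    # Loss declines smoothly and predictably as models grow,
    # even though specific capabilities appear unpredictably.
    for n in [1e8, 1e9, 1e10, 1e11]:
        print(f"{n:.0e} params -> predicted loss {scaling_law_loss(n):.3f}")
```

This smooth aggregate curve is exactly the "predictable" half of the paper's thesis: the overall loss trend can be extrapolated before training, while the specific capabilities that emerge at any given scale cannot.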