Paper Title
Transdisciplinary AI Observatory -- Retrospective Analyses and Future-Oriented Contradistinctions
Paper Authors
Paper Abstract
In recent years, AI safety has gained international recognition in light of heterogeneous safety-critical and ethical issues that risk overshadowing the broad beneficial impacts of AI. In this context, the implementation of AI observatory endeavors represents one key research direction. This paper motivates the need for an inherently transdisciplinary AI observatory approach integrating diverse retrospective and counterfactual views. We delineate aims and limitations while providing hands-on advice utilizing concrete practical examples. Distinguishing between unintentionally and intentionally triggered AI risks with diverse socio-psycho-technological impacts, we exemplify a retrospective descriptive analysis followed by a retrospective counterfactual risk analysis. Building on these AI observatory tools, we present near-term transdisciplinary guidelines for AI safety. As a further contribution, we discuss differentiated and tailored long-term directions through the lens of two disparate modern AI safety paradigms. For simplicity, we refer to these two different paradigms by the terms artificial stupidity (AS) and eternal creativity (EC), respectively. While both AS and EC acknowledge the need for a hybrid cognitive-affective approach to AI safety and overlap with regard to many short-term considerations, they differ fundamentally in the nature of multiple envisaged long-term solution patterns. By compiling relevant underlying contradistinctions, we aim to provide future-oriented incentives for constructive dialectics in practical and theoretical AI safety research.