Paper Title
Arguments about Highly Reliable Agent Designs as a Useful Path to Artificial Intelligence Safety
Paper Authors
Paper Abstract
Several different approaches exist for ensuring the safety of future Transformative Artificial Intelligence (TAI) or Artificial Superintelligence (ASI) systems, and proponents of different approaches have made different and debated claims about the importance or usefulness of their work in the near term and for future systems. Highly Reliable Agent Designs (HRAD) is one of the most controversial and ambitious approaches, championed by the Machine Intelligence Research Institute, among others, and various arguments have been made about whether and how it reduces risks from future AI systems. In order to reduce confusion in the debate about AI safety, here we build on a previous discussion by Rice which collects and presents four central arguments that are used to justify HRAD as a path towards the safety of AI systems. We have titled the arguments (1) incidental utility, (2) deconfusion, (3) precise specification, and (4) prediction. Each of these makes different, partly conflicting claims about how future AI systems can be risky. We have explained the assumptions and claims based on a review of published and informal literature, along with consultation with experts who have stated positions on the topic. Finally, we have briefly outlined arguments against each approach and against the agenda overall.