Climate Change Policy Exploration using Reinforcement Learning

Paper Title

Climate Change Policy Exploration using Reinforcement Learning

Author

Wolf, Theodore

Abstract

Climate change is an extremely complex problem facing humanity. When many variables interact, it can be difficult for humans to grasp the causes and effects of a problem as large in scale as climate change. The climate is a dynamical system in which small changes can have considerable and unpredictable long-term repercussions. Understanding how to nudge this system in the right ways could help us find creative solutions to climate change. In this research, we combine Deep Reinforcement Learning with a World-Earth system model to find, and explain, creative strategies for reaching a sustainable future. This work extends that of Strnad et al. in both method and analysis, along several directions. We use four Reinforcement Learning agents of varying complexity to probe the environment in different ways and to find various strategies. The environment is a low-complexity World-Earth system model in which the goal is to reach, by enacting different policies, a future where all the energy for the economy is produced by renewables. We use a reward function based on planetary boundaries, which we modify to force the agents to find a wider range of strategies. To favour applicability, we slightly modify the environment by injecting noise and making it fully observable, in order to understand the impact of these factors on the agents' learning.
