Title
A Robust and Constrained Multi-Agent Reinforcement Learning Electric Vehicle Rebalancing Method in AMoD Systems
Authors
Abstract
Electric vehicles (EVs) play a critical role in autonomous mobility-on-demand (AMoD) systems, but their unique charging patterns increase the model uncertainties in AMoD systems (e.g., in the state transition probabilities). Since there usually exists a mismatch between the training and test/true environments, incorporating model uncertainty into system design is of critical importance in real-world applications. However, model uncertainties have not yet been explicitly considered in EV AMoD system rebalancing in the existing literature, and the coexistence of model uncertainties and constraints that the decisions must satisfy makes the problem even more challenging. In this work, we design a robust and constrained multi-agent reinforcement learning (MARL) framework with state transition kernel uncertainty for EV AMoD systems. We then propose a robust and constrained MARL algorithm (ROCOMA) with robust natural policy gradients (RNPG) that trains a robust EV rebalancing policy to balance the supply-demand ratio and the charging utilization rate across the city under model uncertainty. Experiments show that ROCOMA learns an effective and robust rebalancing policy: it outperforms non-robust MARL methods in the presence of model uncertainties, increasing system fairness by 19.6% and decreasing rebalancing costs by 75.8%.
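To make the balancing objective concrete, the sketch below illustrates one common way to quantify supply-demand balance across city regions: compute a per-region ratio of available vehicles to requests, then penalize dispersion of those ratios. The function names and the choice of standard deviation as the dispersion measure are assumptions for illustration, not the paper's actual fairness definition.

```python
import numpy as np

def supply_demand_ratios(supply, demand, eps=1e-8):
    """Per-region ratio of available EVs to ride requests.

    `eps` guards against division by zero in regions with no demand.
    """
    supply = np.asarray(supply, dtype=float)
    demand = np.asarray(demand, dtype=float)
    return supply / (demand + eps)

def fairness(ratios):
    """Higher is fairer: penalize deviation of each region's
    supply-demand ratio from the citywide mean (illustrative proxy)."""
    return -float(np.std(ratios))

# A perfectly balanced city scores higher than an imbalanced one.
balanced = fairness(supply_demand_ratios([10, 10, 10, 10], [10, 10, 10, 10]))
skewed = fairness(supply_demand_ratios([20, 2, 12, 6], [10, 10, 10, 10]))
print(balanced > skewed)
```

A rebalancing policy as described in the abstract would move idle EVs between regions so that this kind of dispersion measure (and an analogous one for charging utilization) stays small, subject to operational constraints.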