Low-Thrust Orbital Transfer using Dynamics-Agnostic Reinforcement Learning

Paper Title

Low-Thrust Orbital Transfer using Dynamics-Agnostic Reinforcement Learning

Authors

Casas, Carlos M.; Carro, Belen; Sanchez-Esguevillas, Antonio

Abstract

Low-thrust trajectory design and in-flight control remain two of the most challenging topics for new-generation satellite operations. Most of the solutions currently implemented are based on reference trajectories and lead to sub-optimal fuel usage. Other solutions are based on simple guidance laws that need to be updated periodically, increasing the cost of operations. Whereas some optimization strategies leverage Artificial Intelligence methods, all of the approaches studied so far need either previously generated data or strong a priori knowledge of the satellite dynamics. This study uses model-free Reinforcement Learning to train an agent on a constrained pericenter raising scenario for a low-thrust medium-Earth-orbit satellite. The agent does not have any prior knowledge of the environment dynamics, which leaves it unbiased by classical trajectory optimization patterns. The trained agent is then used to design a trajectory and to autonomously control the satellite during the cruise. Simulations show that a dynamics-agnostic agent is able to learn a quasi-optimal guidance law and responds well to uncertainties in the environment dynamics. The results obtained open the door to the use of Reinforcement Learning on more complex scenarios and multi-satellite problems, or to explore trajectories in environments where a reference solution is not known.
