Parameterized Differential Dynamic Programming

Paper Title

Parameterized Differential Dynamic Programming

Authors

Alex Oshin, Matthew D. Houghton, Michael J. Acheson, Irene M. Gregory, Evangelos A. Theodorou

Abstract

Differential Dynamic Programming (DDP) is an efficient trajectory optimization algorithm relying on second-order approximations of a system's dynamics and cost function, and has recently been applied to optimize systems with time-invariant parameters. Prior works include system parameter estimation and identifying the optimal switching time between modes of hybrid dynamical systems. This paper generalizes previous work by proposing a general parameterized optimal control objective and deriving a parametric version of DDP, titled Parameterized Differential Dynamic Programming (PDDP). A rigorous convergence analysis of the algorithm is provided, and PDDP is shown to converge to a minimum of the cost regardless of initialization. The effects of varying the optimization to more effectively escape local minima are analyzed. Experiments are presented applying PDDP on multiple robotics systems to solve model predictive control (MPC) and moving horizon estimation (MHE) tasks simultaneously. Finally, PDDP is used to determine the optimal transition point between flight regimes of a complex urban air mobility (UAM) class vehicle exhibiting multiple phases of flight.
