Title
A Study on Learning and Simulating Personalized Car-Following Driving Style
Authors
Abstract
Automated vehicles are gradually entering people's daily life to provide a comfortable driving experience for the users. Generic, user-agnostic automated vehicles have limited ability to accommodate the different driving styles of different users. This limitation not only impacts users' satisfaction but also causes safety concerns. Learning from user demonstrations can provide direct insights into users' driving preferences. However, it is difficult to understand a driver's preferences with limited data. In this study, we use a model-free inverse reinforcement learning method to study drivers' characteristics in the car-following scenario from a naturalistic driving dataset, and show that this method is capable of representing users' preferences with reward functions. To predict the driving styles of drivers with limited data, we apply Gaussian Mixture Models and compute the similarity of a specific driver to the clusters of drivers. We design a personalized adaptive cruise control (P-ACC) system through a partially observable Markov decision process (POMDP) model. A reward function that mimics the driver's driving style is integrated into the model, with a constraint on the relative distance to ensure driving safety. Prediction of the driving styles achieves 85.7% accuracy with data from fewer than 10 car-following events. The model-based experimental driving trajectories demonstrate that the P-ACC system can provide a personalized driving experience.
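The abstract's style-prediction step, computing a new driver's similarity to pre-fitted Gaussian mixture clusters from only a few car-following events, can be illustrated with a minimal sketch. Everything here is assumed for illustration: the cluster names, the two features (mean time headway and acceleration variance), and the mixture parameters are hypothetical, not taken from the paper, and the clusters are treated as already fitted with diagonal covariances.

```python
import math

# Hypothetical pre-fitted driving-style clusters (assumed values, NOT from the
# paper): each cluster is a diagonal Gaussian over two car-following features,
# e.g. mean time headway (s) and acceleration variance.
CLUSTERS = {
    "aggressive":   {"mean": (1.0, 0.8), "var": (0.04, 0.04), "weight": 0.3},
    "moderate":     {"mean": (1.8, 0.4), "var": (0.09, 0.02), "weight": 0.5},
    "conservative": {"mean": (2.6, 0.2), "var": (0.16, 0.01), "weight": 0.2},
}

def log_gaussian(x, mean, var):
    """Log density of a diagonal Gaussian at point x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def style_posterior(features):
    """Posterior probability of each style cluster for one feature vector."""
    log_joint = {
        name: math.log(c["weight"]) + log_gaussian(features, c["mean"], c["var"])
        for name, c in CLUSTERS.items()
    }
    # Log-sum-exp normalization for numerical stability.
    max_lj = max(log_joint.values())
    unnorm = {k: math.exp(v - max_lj) for k, v in log_joint.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

def predict_style(events):
    """Average features over a handful of car-following events, then classify."""
    n = len(events)
    avg = tuple(sum(e[i] for e in events) / n for i in range(2))
    post = style_posterior(avg)
    return max(post, key=post.get)

# A driver observed for only three events, with short headways:
events = [(1.1, 0.7), (0.9, 0.9), (1.0, 0.8)]
print(predict_style(events))  # -> "aggressive" under these assumed clusters
```

In practice the clusters would be fitted with EM (e.g. scikit-learn's `GaussianMixture`) on drivers with ample data, and the posterior over clusters, rather than a hard label, could weight the cluster-level reward functions for a driver seen in only a few events.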