Evaluation metrics for behaviour modeling

Title

Evaluation metrics for behaviour modeling

Authors

Im, Daniel Jiwoong; Kwak, Iljung; Branson, Kristin

Abstract

A primary difficulty with unsupervised discovery of structure in large data sets is a lack of quantitative evaluation criteria. In this work, we propose and investigate several metrics for evaluating and comparing generative models of behavior learned using imitation learning. Compared to the commonly-used model log-likelihood, these criteria look at longer temporal relationships in behavior, are relevant if behavior has some properties that are inherently unpredictable, and highlight biases in the overall distribution of behaviors produced by the model. Pointwise metrics compare real to model-predicted trajectories given true past information. Distribution metrics compare statistics of the model simulating behavior in open loop, and are inspired by how experimental biologists evaluate the effects of manipulations on animal behavior. We show that the proposed metrics correspond with biologists' intuitions about behavior, and allow us to evaluate models, understand their biases, and enable us to propose new research directions.
