How many labelers do you have? A closer look at gold-standard labels

Paper title

How many labelers do you have? A closer look at gold-standard labels

Authors

Chen Cheng, Hilal Asi, John Duchi

Abstract

The construction of most supervised learning datasets revolves around collecting multiple labels for each instance, then aggregating the labels to form a type of "gold-standard". We question the wisdom of this pipeline by developing a (stylized) theoretical model of this process and analyzing its statistical consequences, showing how access to non-aggregated label information can make training well-calibrated models more feasible than it is with gold-standard labels. The entire story, however, is subtle, and the contrasts between aggregated and fuller label information depend on the particulars of the problem, where estimators that use aggregated information exhibit robust but slower rates of convergence, while estimators that can effectively leverage all labels converge more quickly if they have fidelity to (or can learn) the true labeling process. The theory makes several predictions for real-world datasets, including when non-aggregate labels should improve learning performance, which we test to corroborate the validity of our predictions.
