CROWDLAB: Supervised learning to infer consensus labels and quality scores for data with multiple annotators

Paper Title

CROWDLAB: Supervised learning to infer consensus labels and quality scores for data with multiple annotators

Authors

Hui Wen Goh, Ulyana Tkachenko, Jonas Mueller

Abstract

Real-world data for classification is often labeled by multiple annotators. For analyzing such data, we introduce CROWDLAB, a straightforward approach to utilize any trained classifier to estimate: (1) A consensus label for each example that aggregates the available annotations; (2) A confidence score for how likely each consensus label is correct; (3) A rating for each annotator quantifying the overall correctness of their labels. Existing algorithms to estimate related quantities in crowdsourcing often rely on sophisticated generative models with iterative inference. CROWDLAB instead uses a straightforward weighted ensemble. Existing algorithms often rely solely on annotator statistics, ignoring the features of the examples from which the annotations derive. CROWDLAB utilizes any classifier model trained on these features, and can thus better generalize between examples with similar features. On real-world multi-annotator image data, our proposed method provides superior estimates for (1)-(3) than existing algorithms like Dawid-Skene/GLAD.
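
The abstract names a weighted ensemble of classifier predictions and annotator labels but does not give the weights. The following is a minimal sketch of that general idea, not the paper's exact estimator. The function names (`majority_vote`, `weighted_ensemble`) and the weighting scheme are assumptions for illustration: each annotator is weighted by their agreement rate with majority vote, and the classifier by its accuracy against majority vote.

```python
from collections import Counter


def majority_vote(labels):
    """Most common label; ties are broken by the smallest label value."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(lab for lab, c in counts.items() if c == top)


def weighted_ensemble(annotations, pred_probs):
    """Illustrative weighted ensemble (assumed weights, not the paper's).

    annotations: list of {annotator_id: class_index}, one dict per example.
    pred_probs:  list of per-class probabilities from any trained classifier.
    Returns (consensus labels, confidence scores, annotator quality ratings).
    """
    num_classes = len(pred_probs[0])
    mv = [majority_vote(list(a.values())) for a in annotations]

    # Assumed annotator weight: agreement rate with the majority-vote label.
    agree, total = Counter(), Counter()
    for a, m in zip(annotations, mv):
        for j, lab in a.items():
            total[j] += 1
            agree[j] += int(lab == m)
    w_annotator = {j: agree[j] / total[j] for j in total}

    # Assumed classifier weight: accuracy against the majority-vote label.
    model_pred = [max(range(num_classes), key=p.__getitem__) for p in pred_probs]
    w_model = sum(int(p == m) for p, m in zip(model_pred, mv)) / len(mv)

    consensus, confidence = [], []
    for a, p in zip(annotations, pred_probs):
        # Weighted sum of classifier probabilities and one-hot annotator votes.
        scores = [w_model * pk for pk in p]
        for j, lab in a.items():
            scores[lab] += w_annotator[j]
        norm = sum(scores)
        post = [s / norm for s in scores]
        best = max(range(num_classes), key=post.__getitem__)
        consensus.append(best)      # (1) consensus label
        confidence.append(post[best])  # (2) confidence in that label

    # (3) Annotator rating: agreement rate with the final consensus labels.
    hits = Counter()
    for a, c in zip(annotations, consensus):
        for j, lab in a.items():
            hits[j] += int(lab == c)
    quality = {j: hits[j] / total[j] for j in total}
    return consensus, confidence, quality


if __name__ == "__main__":
    # Toy data: three hypothetical annotators, two classes, four examples.
    annotations = [{"a": 0, "b": 0, "c": 1}, {"a": 1, "b": 1},
                   {"a": 0, "c": 1}, {"b": 1, "c": 0}]
    pred_probs = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7], [0.1, 0.9]]
    print(weighted_ensemble(annotations, pred_probs))
```

Because the classifier's probabilities enter the ensemble, an example where annotators split evenly can still be resolved by the model and by how reliable each annotator appears elsewhere, which is the advantage over annotation-only statistics that the abstract describes.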
