Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision

Paper Title

Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision

Authors

Jieyu Zhang, Linxin Song, Alexander Ratner

Abstract

Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm for synthesizing training labels efficiently. The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources abstracted as labeling functions (LFs). Existing statistical label models typically rely only on the outputs of the LFs, ignoring instance features when modeling the underlying generative process. In this paper, we attempt to incorporate instance features into a statistical label model via the proposed FABLE. In particular, FABLE is built on a mixture of Bayesian label models, each corresponding to a global pattern of correlation, and the coefficients of the mixture components are predicted by a Gaussian Process classifier based on instance features. We adopt an auxiliary-variable-based variational inference algorithm to handle the non-conjugacy between the Gaussian Process and the Bayesian label models. In an extensive empirical comparison on eleven benchmark datasets, FABLE achieves the highest average performance compared with nine baselines.
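
The idea of instance-dependent mixture weights over several label models can be illustrated with a small sketch. This is not the authors' implementation. It assumes binary labels, LF votes in {0, 1} with -1 for abstain, and fixed per-component LF accuracies in place of Bayesian label models. A softmax over a linear score stands in for the Gaussian Process classifier, and no variational inference is performed. All names (`mixture_weights`, `component_likelihood`, `posterior`, `W`, `accs`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_weights(x, W):
    """Instance-dependent mixture coefficients.

    Assumption: a linear score per component followed by softmax,
    used here as a simple stand-in for a GP classifier.
    """
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in W])


def component_likelihood(votes, y, acc):
    """p(votes | y) under one component with per-LF accuracies `acc`.

    LFs are treated as conditionally independent; abstains (-1) are skipped.
    """
    lik = 1.0
    for vote, a in zip(votes, acc):
        if vote == -1:
            continue
        lik *= a if vote == y else 1.0 - a
    return lik


def posterior(votes, x, W, accs, prior=(0.5, 0.5)):
    """p(y | votes, x) for y in {0, 1} under the mixture of label models."""
    pi = mixture_weights(x, W)
    joint = [
        prior[y] * sum(p * component_likelihood(votes, y, acc)
                       for p, acc in zip(pi, accs))
        for y in (0, 1)
    ]
    z = sum(joint)
    return [j / z for j in joint]


# Two components: the first trusts LF 0 more, the second trusts LF 1 more.
accs = [[0.9, 0.6], [0.6, 0.9]]
W = [[2.0, 0.0], [0.0, 2.0]]   # hypothetical gating weights
votes = [1, 0]                 # the two LFs disagree

print(posterior(votes, [1.0, 0.0], W, accs))  # leans toward label 1
print(posterior(votes, [0.0, 1.0], W, accs))  # leans toward label 0
```

With the same conflicting votes, the inferred label changes with the instance features because the features decide which component, and therefore which LF, is trusted more. A label model that uses LF outputs alone cannot make this distinction.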
