Paper Title

Label Confusion Learning to Enhance Text Classification Models

Authors

Guo, Biyang; Han, Songqiao; Han, Xiao; Huang, Hailiang; Lu, Ting

Abstract

Representing a true label as a one-hot vector is a common practice in training text classification models. However, the one-hot representation may not adequately reflect the relation between instances and labels, as labels are often not completely independent and instances may relate to multiple labels in practice. Inadequate one-hot representations tend to train the model to be over-confident, which may result in arbitrary predictions and model overfitting, especially for confused datasets (datasets with very similar labels) or noisy datasets (datasets with labeling errors). While training models with label smoothing (LS) can ease this problem to some degree, it still fails to capture the realistic relation among labels. In this paper, we propose a novel Label Confusion Model (LCM) as an enhancement component to current popular text classification models. LCM can learn label confusion to capture semantic overlap among labels by calculating the similarity between instances and labels during training and generate a better label distribution to replace the original one-hot label vector, thus improving the final classification performance. Extensive experiments on five text classification benchmark datasets reveal the effectiveness of LCM for several widely used deep learning classification models. Further experiments also verify that LCM is especially helpful for confused or noisy datasets and superior to the label smoothing method.
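
To make the contrast in the abstract concrete, the following is a minimal NumPy sketch of three kinds of training target: a one-hot vector, a label-smoothed vector, and a target reshaped by instance-label similarity. The abstract gives no formulas, so everything specific here is an assumption for illustration only: the dot-product similarity, the mixing weight `alpha`, the toy vectors, and the function names are hypothetical and should not be read as the authors' exact LCM formulation.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def one_hot(label, num_classes):
    """Standard one-hot target: all mass on the true label."""
    v = np.zeros(num_classes)
    v[label] = 1.0
    return v


def label_smoothing(target, eps=0.1):
    """Label smoothing: move eps of the mass to a uniform distribution.

    Every wrong label receives the same share, regardless of how similar
    it is to the true label.
    """
    k = target.shape[-1]
    return (1.0 - eps) * target + eps / k


def confusion_aware_target(instance_vec, label_vecs, target, alpha=4.0):
    """Hypothetical similarity-based target (illustrative assumption).

    1. Score each label by its dot product with the instance representation.
    2. Turn the scores into a distribution over labels.
    3. Mix that distribution with the one-hot target (weighted by alpha)
       and renormalize with softmax.
    """
    sims = label_vecs @ instance_vec      # shape: (num_classes,)
    confusion = softmax(sims)             # similarity-based label distribution
    return softmax(alpha * target + confusion)


# Toy setup (made-up vectors): labels 0 and 1 are semantically close,
# label 2 is unrelated to the instance.
label_vecs = np.array([[1.0, 0.0],
                       [0.9, 0.1],
                       [-1.0, 0.0]])
instance_vec = np.array([1.0, 0.0])

hard = one_hot(0, num_classes=3)
smoothed = label_smoothing(hard, eps=0.1)
soft = confusion_aware_target(instance_vec, label_vecs, hard, alpha=4.0)

print("one-hot:        ", hard)
print("label smoothing:", smoothed)
print("similarity-based:", soft)
```

In this toy setup, label smoothing gives the two wrong labels identical mass, whereas the similarity-based target gives more mass to the label whose vector is closer to the instance. A model would then typically be trained against such a soft target with a divergence loss such as KL divergence, which is again an assumption of this sketch.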
