Paper Title
TIER-A: Denoising Learning Framework for Information Extraction
Paper Authors
Paper Abstract
With the development of deep neural language models, great progress has recently been made in information extraction. However, deep learning models often overfit noisy data points, leading to poor performance. In this work, we examine the role of information entropy in the overfitting process and draw a key insight: overfitting is a process of growing overconfidence and decreasing entropy. Motivated by this property, we propose TIER-A, a simple yet effective co-regularization joint-training framework: an aggregation joint-training framework with temperature calibration and information entropy regularization. Our framework consists of several neural models with identical structures. These models are trained jointly, and we avoid overfitting by introducing temperature calibration and information entropy regularization. Extensive experiments on two widely used but noisy datasets, TACRED and CoNLL03, demonstrate the correctness of our assumption and the effectiveness of our framework.
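A minimal sketch of the kind of objective the abstract describes, not the authors' released code: two identically structured peer models are trained jointly, the softmax is softened by a temperature, peers are co-regularized toward agreement, and an entropy term discourages the overconfident, low-entropy predictions that accompany overfitting. All names and coefficients (`tier_a_loss`, `temperature`, `agree_weight`, `entropy_weight`) are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def tier_a_loss(logits_a, logits_b, labels, temperature=2.0,
                agree_weight=1.0, entropy_weight=0.1):
    """Hypothetical TIER-A-style joint-training objective: per-model
    cross-entropy, a symmetric agreement (co-regularization) term between
    the two peers, and an entropy bonus. Weights are illustrative."""
    # Temperature-calibrated distributions (T > 1 softens the softmax,
    # countering overconfidence).
    p_a = F.softmax(logits_a / temperature, dim=-1)
    p_b = F.softmax(logits_b / temperature, dim=-1)

    # Standard supervised loss for each peer model.
    ce = F.cross_entropy(logits_a, labels) + F.cross_entropy(logits_b, labels)

    # Co-regularization: symmetric KL divergence pushes the peers to agree.
    agree = (F.kl_div(p_a.clamp_min(1e-8).log(), p_b, reduction="batchmean")
             + F.kl_div(p_b.clamp_min(1e-8).log(), p_a, reduction="batchmean"))

    # Entropy regularization: subtracting the entropy term from the loss
    # rewards higher-entropy (less overconfident) predictions on noisy data.
    entropy = (-(p_a * p_a.clamp_min(1e-8).log()).sum(-1).mean()
               - (p_b * p_b.clamp_min(1e-8).log()).sum(-1).mean())

    return ce + agree_weight * agree - entropy_weight * entropy
```

In this sketch, minimizing the negative-entropy term keeps the temperature-scaled predictions from collapsing to near-one-hot distributions on noisy labels, which matches the abstract's view of overfitting as overconfidence and entropy decrease.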