Paper title
KPI-BERT: A Joint Named Entity Recognition and Relation Extraction Model for Financial Reports
Paper authors
Paper abstract
We present KPI-BERT, a system that employs novel methods of named entity recognition (NER) and relation extraction (RE) to extract and link key performance indicators (KPIs), e.g. "revenue" or "interest expenses", of companies from real-world German financial documents. Specifically, we introduce an end-to-end trainable architecture based on Bidirectional Encoder Representations from Transformers (BERT) that combines a recurrent neural network (RNN) with conditional label masking to tag entities sequentially before classifying their relations. Our model also introduces a learnable RNN-based pooling mechanism and incorporates domain expert knowledge by explicitly filtering impossible relations. We achieve substantially higher prediction performance on a new practical dataset of German financial reports, outperforming several strong baselines, including a competing state-of-the-art span-based entity tagging approach.
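The two constraint mechanisms named in the abstract, conditional label masking and the filtering of impossible relations, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the BIO tag set, the entity types `kpi` and `value`, the `ALLOWED_PAIRS` table, and the fixed score matrix that stands in for the output of the BERT encoder and RNN tagger. The sketch shows only the masking and filtering steps, not the trained model.

```python
import math

# Hypothetical BIO tag set; the abstract names neither the tagging scheme nor the entity types.
TAGS = ["O", "B-kpi", "I-kpi", "B-value", "I-value"]


def allowed_next(prev_tag, tag):
    """BIO constraint: an I-x tag may only follow B-x or I-x of the same type."""
    if not tag.startswith("I-"):
        return True
    entity_type = tag[2:]
    return prev_tag in ("B-" + entity_type, "I-" + entity_type)


def greedy_decode_with_masking(scores):
    """Tag tokens left to right; scores[t][k] is the score of TAGS[k] at token t.

    Labels that are invalid given the previously chosen label are masked
    to -inf before the best label is picked.
    """
    prev_tag = "O"
    result = []
    for token_scores in scores:
        masked = [
            score if allowed_next(prev_tag, tag) else -math.inf
            for score, tag in zip(token_scores, TAGS)
        ]
        best = max(range(len(TAGS)), key=lambda k: masked[k])
        prev_tag = TAGS[best]
        result.append(prev_tag)
    return result


# Hypothetical expert rule: only a KPI may be linked to a value.
ALLOWED_PAIRS = {("kpi", "value")}


def filter_relations(candidates):
    """Keep only (head, tail) pairs whose entity types may be related.

    Each entity is a (text, entity_type) tuple.
    """
    return [
        (head, tail)
        for head, tail in candidates
        if (head[1], tail[1]) in ALLOWED_PAIRS
    ]


# Made-up scores for three tokens, one column per entry in TAGS.
scores = [
    [0.1, 0.2, 0.9, 0.0, 0.0],  # I-kpi scores highest but cannot start a sequence
    [0.0, 0.1, 0.5, 0.0, 0.8],  # I-value scores highest but cannot follow B-kpi
    [0.3, 0.0, 0.0, 0.6, 0.1],
]
tags = greedy_decode_with_masking(scores)

candidates = [
    (("Umsatz", "kpi"), ("5 Mio.", "value")),
    (("5 Mio.", "value"), ("Umsatz", "kpi")),
    (("Umsatz", "kpi"), ("Zinsaufwand", "kpi")),
]
kept = filter_relations(candidates)
```

In this toy run the masking overrides the highest raw score at the first two tokens, and the filter discards the candidate pairs whose type combination is not in the table, so a downstream relation classifier would only have to score the remaining pair.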