Paper Title

Document Intelligence Metrics for Visually Rich Document Evaluation

Authors

Jonathan DeGange, Swapnil Gupta, Zhuoyu Han, Krzysztof Wilkosz, Adam Karwan

Abstract

The processing of Visually-Rich Documents (VRDs) is highly important in information extraction tasks associated with Document Intelligence. We introduce DI-Metrics, a Python library devoted to VRD model evaluation, comprising text-based, geometry-based, and hierarchical metrics for information extraction tasks. We apply DI-Metrics to evaluate information extraction performance on the publicly available CORD dataset, comparing the performance of three state-of-the-art (SOTA) models and one industry model. The open-source library is available on GitHub.
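To make the metric families named in the abstract more concrete, the following is a minimal sketch of one text-based score (normalized edit-distance similarity) and one geometry-based score (intersection over union of bounding boxes). It is not the DI-Metrics API: the function names `levenshtein`, `text_similarity`, and `box_iou`, and the `(x_min, y_min, x_max, y_max)` box layout, are assumptions chosen for illustration, and the library may define its metrics differently. Hierarchical metrics are not covered here.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def text_similarity(pred: str, truth: str) -> float:
    """Illustrative text-based score in [0, 1]; 1.0 means an exact match."""
    longest = max(len(pred), len(truth))
    if longest == 0:
        return 1.0  # two empty fields are treated as identical
    return 1.0 - levenshtein(pred, truth) / longest


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Illustrative geometry-based score: IoU of two axis-aligned boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max).
    """
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

In an evaluation loop, a predicted field would typically be compared with its ground-truth counterpart using both scores, so that a model is credited for reading the right text and for locating it in the right place on the page.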
