Paper Title
Table Structure Recognition with Conditional Attention
Authors
Abstract
Tabular data in digital documents is widely used to present compact and important information to readers. However, parsing tables from unstructured digital documents, such as PDFs and images, into a machine-readable format is challenging because of the complexity of table structures and the absence of meta-information. The Table Structure Recognition (TSR) problem aims to recognize the structure of a table and transform an unstructured table into a structured, machine-readable format so that the tabular data can be further analyzed by downstream tasks, such as semantic modeling and information retrieval. In this study, we hypothesize that a complicated table structure can be represented by a graph whose vertices and edges represent the cells and the associations between cells, respectively. We then define the table structure recognition problem as a cell association classification problem and propose a conditional attention network (CATT-Net). The experimental results demonstrate the superiority of our proposed method over state-of-the-art methods on various datasets. In addition, we investigate whether the alignment of cell bounding boxes or a text-focused approach has more impact on model performance. Because public dataset annotations based on these two approaches are lacking, we further annotate the ICDAR2013 dataset with both types of bounding boxes, which can serve as a new benchmark dataset for evaluating methods in this field. Experimental results show that aligning cell bounding boxes helps improve the micro-averaged F1 score from 0.915 to 0.963 and the macro-averaged F1 score from 0.787 to 0.923.
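The graph formulation and the two reported metrics can be illustrated with a minimal sketch. It does not implement CATT-Net. The toy table, the label set (`same_row`, `same_col`, `none`), and the helper names `association` and `f1_scores` are assumptions made for illustration; the paper may define its association classes differently.

```python
from itertools import combinations

# Hypothetical toy table: each cell is (row_start, row_end, col_start, col_end),
# with inclusive indices, so a spanning cell covers a range of rows or columns.
cells = {
    "header": (0, 0, 0, 1),  # spans both columns
    "a": (1, 1, 0, 0),
    "b": (1, 1, 1, 1),
}

def overlaps(lo1, hi1, lo2, hi2):
    """Return True if two inclusive integer ranges intersect."""
    return lo1 <= hi2 and lo2 <= hi1

def association(c1, c2):
    """Assumed label set for a cell pair: same_row, same_col, or none."""
    if overlaps(c1[0], c1[1], c2[0], c2[1]):
        return "same_row"
    if overlaps(c1[2], c1[3], c2[2], c2[3]):
        return "same_col"
    return "none"

# Vertices are cells; each unordered pair of cells is an edge to classify.
edges = {
    (u, v): association(cells[u], cells[v])
    for u, v in combinations(sorted(cells), 2)
}

def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp_sum = fp_sum = fn_sum = 0
    per_class = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn
    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / micro_denom if micro_denom else 0.0
    macro = sum(per_class) / len(per_class) if per_class else 0.0
    return micro, macro

# Ground-truth edge labels versus a made-up prediction with one mistake.
y_true = [edges[pair] for pair in sorted(edges)]
y_pred = ["same_row", "same_col", "none"]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
```

Micro-averaging pools the counts over all cell pairs, whereas macro-averaging gives every class equal weight, so the macro score is more sensitive to errors on rare association classes.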