Paper Title
Event Data Quality: A Survey
Authors
Abstract
Event data are prevalent today in diverse domains such as financial trading, business workflows, and industrial IoT. An event is often characterized by several attributes that denote its meaning, associated with the corresponding occurrence time or duration. From traditional operational systems in enterprises to online systems for Web services, event data are generated continuously from the physical world. However, owing to the variety and veracity characteristics of big data, event data generated from heterogeneous and dirty sources can have very different event representations and data quality issues. In this work, we summarize several representative works on data quality issues of event data, covering (1) event matching, (2) event error detection, (3) event data repair, and (4) approximate pattern matching.