Federated and Privacy-Preserving Learning of Accounting Data in Financial Statement Audits

Paper Title

Federated and Privacy-Preserving Learning of Accounting Data in Financial Statement Audits

Authors

Marco Schreyer, Timur Sattarov, Damian Borth

Abstract

The ongoing 'digital transformation' fundamentally changes the nature, recording, and volume of audit evidence. Nowadays, the International Standards on Auditing (ISA) require auditors to examine vast volumes of the digital accounting records underlying a financial statement. As a result, audit firms are also 'digitizing' their analytical capabilities and investing in Deep Learning (DL), a successful sub-discipline of Machine Learning. Applying DL makes it possible to learn specialized audit models from the data of multiple clients, e.g., organizations operating in the same industry or jurisdiction. In general, regulations require auditors to adhere to strict data confidentiality measures. At the same time, recent findings have shown that large-scale DL models are vulnerable to leaking sensitive information about their training data. Today, it often remains unclear how audit firms can apply DL models while complying with data protection regulations. In this work, we propose a Federated Learning framework to train DL models on audit-relevant accounting data of multiple clients. The framework encompasses Differential Privacy and Split Learning capabilities to mitigate data confidentiality risks at model inference. We evaluate our approach on the detection of accounting anomalies in three real-world datasets of city payments. Our results provide empirical evidence that auditors can benefit from DL models that accumulate knowledge from multiple sources of proprietary client data.
