Detecting Transaction-based Tax Evasion Activities on Social Media Platforms Using Multi-modal Deep Neural Networks

Paper title

Detecting Transaction-based Tax Evasion Activities on Social Media Platforms Using Multi-modal Deep Neural Networks

Authors

Zhang, Lelin; Nan, Xi; Huang, Eva; Liu, Sidong

Abstract

Social media platforms now serve billions of users by providing convenient means of communication, content sharing, and even payment between users. Owing to their convenient and anarchic nature, they are also widely used to promote and conduct business between unregistered market participants who pay no tax. Tax authorities worldwide have difficulty regulating these hidden-economy activities by traditional regulatory means. This paper presents a machine-learning-based Regtech tool that international tax authorities can use to detect transaction-based tax evasion activities on social media platforms. To build the tool, we collected a dataset of 58,660 Instagram posts and manually labelled a sample of 2,081 posts with multiple properties related to transaction-based tax evasion. Based on this dataset, we developed a multi-modal deep neural network that automatically detects suspicious posts. The proposed model combines the comment, hashtag, and image modalities to produce the final output. In our experiments, the combined model achieved an AUC of 0.808 and an F1 score of 0.762, outperforming every single-modality model. The tool could help tax authorities identify audit targets efficiently and effectively, and combat social e-commerce tax evasion at scale.
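
The abstract reports an AUC and an F1 score for a model that combines three modalities, but it does not say how the modalities are fused. The sketch below is only an illustration of the two metrics, not the authors' implementation: it assumes a simple late fusion that averages per-modality scores, and the function names (fuse_scores, f1_score, auc), the 0.5 threshold, and all toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_scores(per_modality: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-post scores across modalities (ASSUMED late fusion)."""
    n_posts = len(per_modality[0])
    n_mods = len(per_modality)
    return [sum(m[i] for m in per_modality) / n_mods for i in range(n_posts)]


def f1_score(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 = harmonic mean of precision and recall at a fixed threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Requires at least one positive
    and one negative label.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = suspicious post).
labels = [1, 0, 1, 0]
comment_scores = [0.9, 0.6, 0.4, 0.2]
hashtag_scores = [0.7, 0.3, 0.6, 0.4]
image_scores = [0.8, 0.3, 0.8, 0.3]

fused = fuse_scores([comment_scores, hashtag_scores, image_scores])

print("comments only:", auc(labels, comment_scores), f1_score(labels, comment_scores))
print("fused:        ", auc(labels, fused), f1_score(labels, fused))
```

AUC is threshold-free and measures ranking quality, whereas F1 depends on the chosen decision threshold, which is one reason the two figures are commonly reported together.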
