Paper Title
Privacy Preservation in Federated Learning: An insightful survey from the GDPR Perspective
Authors
Abstract
With the rapid growth of AI and machine learning-based applications and services, data privacy and security have become critical challenges. Conventionally, data is collected and aggregated in a data centre, where machine learning models are trained. This centralised approach creates severe privacy risks, including leakage, misuse, and abuse of personal data. Furthermore, in the era of the Internet of Things and big data, in which data is inherently distributed, transferring vast amounts of data to a data centre for processing is a cumbersome solution. This is due not only to the difficulty of transferring and sharing data across data sources, but also to the challenge of complying with rigorous data protection regulations and complicated administrative procedures, such as the EU General Data Protection Regulation (GDPR). In this respect, federated learning (FL) emerges as a prospective solution that facilitates distributed collaborative learning without disclosing the original training data, whilst naturally complying with the GDPR. However, recent research has demonstrated that retaining data and computation on-device in FL is not sufficient to guarantee privacy. This is because the ML model parameters exchanged between parties in an FL system can still carry sensitive information, which can be exploited in some privacy attacks. Therefore, FL systems must be equipped with efficient privacy-preserving techniques to comply with the GDPR. This article systematically surveys the state-of-the-art privacy-preserving techniques that can be employed in FL, as well as how these techniques mitigate data security and privacy risks. Furthermore, we provide insights into the challenges, along with prospective approaches following the GDPR regulatory guidelines, that an FL system must address to comply with the GDPR.