Paper title
Mahiru: a federated, policy-driven data processing and exchange system
Paper authors
Abstract
Secure, privacy-preserving sharing of scientific or business data is currently a popular topic for research and development, both in academia and outside of it. Systems have been proposed for sharing individual facts about individuals and sharing entire data sets, for sharing data through trusted third parties, for obfuscating sensitive data by anonymisation and homomorphic encryption, for distributed processing as in federated machine learning and secure multiparty computation, and for trading data access or ownership. However, these systems typically support only one of these solutions, while organisations often have a variety of data and use cases for which different solutions are appropriate. If a single system could be built that is flexible enough to support a variety of solutions, then administration would be greatly simplified and attack surfaces reduced. In this paper we present Mahiru, a design for a data exchange and processing system in which owners of data and software fully control their assets, users may submit a wide variety of processing requests including most of the above applications, and all parties collaborate to execute those requests in a distributed fashion, while ensuring that the policies are adhered to at all times. This is achieved through a federated, mostly decentralised architecture and a powerful policy mechanism designed to be easy to understand and simple to implement. We have created a proof-of-concept implementation of the system which is openly available and in continuous development, and which we aim to continue to extend with new functionality.