Paper Title
Achieving Transparency in Distributed Machine Learning with Explainable Data Collaboration
Authors
Abstract
Transparency of Machine Learning models used for decision support in various industries is becoming essential for ensuring their ethical use. To that end, feature attribution methods such as SHAP (SHapley Additive exPlanations) are widely used to explain the predictions of black-box machine learning models to customers and developers. However, a parallel trend has been to train machine learning models in collaboration with other data holders without accessing their data. Such models, trained over horizontally or vertically partitioned data, present a challenge for explainable AI because the explaining party may have a biased view of the background data or only a partial view of the feature space. As a result, explanations obtained from different participants in distributed machine learning may be inconsistent with one another, undermining trust in the product. This paper presents an Explainable Data Collaboration Framework based on a model-agnostic additive feature attribution algorithm (KernelSHAP) and the Data Collaboration method of privacy-preserving distributed machine learning. In particular, we present three algorithms for different scenarios of explainability in Data Collaboration and verify their consistency with experiments on open-access datasets. Our results demonstrate a significant decrease (by at least a factor of 1.75) in feature attribution discrepancies among the users of distributed machine learning.
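As a concrete illustration of the attribution method the abstract builds on, the following minimal sketch shows how KernelSHAP is typically invoked through the open-source shap library. The dataset, model, and background-sample size are illustrative assumptions, not the paper's experimental setup.

# Minimal sketch (not the paper's code): model-agnostic feature attribution
# with KernelSHAP via the open-source shap library. Dataset, model, and
# background-sample size here are illustrative assumptions.
import shap
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# The background sample approximates the feature distribution; a biased or
# partial background is exactly what can make explanations diverge between
# participants of distributed machine learning, as the abstract notes.
background = shap.sample(X_train, 50)
explainer = shap.KernelExplainer(model.predict_proba, background)
shap_values = explainer.shap_values(X_test[:5])  # per-feature attributions

In a data-collaboration setting, each party would run such a computation with its own background data or feature subset, which is the source of the attribution discrepancies the paper's framework reduces.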