Paper Title


Captum: A unified and generic model interpretability library for PyTorch

Authors

Narine Kokhlikyan, Vivek Miglani, Miguel Martin, Edward Wang, Bilal Alsallakh, Jonathan Reynolds, Alexander Melnikov, Natalia Kliushkina, Carlos Araya, Siqi Yan, Orion Reblitz-Richardson

Abstract


In this paper we introduce a novel, unified, open-source model interpretability library for PyTorch [12]. The library contains generic implementations of a number of gradient- and perturbation-based attribution algorithms, also known as feature, neuron, and layer importance algorithms, as well as a set of evaluation metrics for these algorithms. It can be used for both classification and non-classification models, including graph-structured models built on neural networks (NNs). In this paper we give a high-level overview of the supported attribution algorithms and show how to perform memory-efficient and scalable computations. We emphasize that the three main characteristics of the library are multimodality, extensibility, and ease of use. Multimodality means support for different input modalities such as image, text, audio, or video. Extensibility allows new algorithms and features to be added. The library is also designed to be easy to understand and use. In addition, we introduce an interactive visualization tool called Captum Insights that is built on top of the Captum library and allows sample-based model debugging and visualization using feature importance metrics.
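To make the idea of a gradient-based attribution (feature importance) method concrete, here is a minimal sketch of the Gradient × Input technique for a toy logistic-regression "model". This is an illustration of the general class of methods the abstract mentions, not Captum's actual API; the weights and input values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_x_input_attribution(w, b, x):
    """Gradient * Input attribution for a logistic-regression toy model.

    For y = sigmoid(w . x + b), the gradient of y w.r.t. x has the
    closed form y * (1 - y) * w; multiplying it elementwise by x gives
    a per-feature importance score. This is one of the simplest
    gradient-based attribution methods in the family the paper surveys.
    """
    y = sigmoid(np.dot(w, x) + b)
    grad = y * (1.0 - y) * w   # dy/dx for this model
    return grad * x            # elementwise Gradient * Input

# Hypothetical toy weights and input (not from the paper)
w = np.array([2.0, -1.0, 0.0])
x = np.array([1.0, 1.0, 5.0])
attr = gradient_x_input_attribution(w, 0.0, x)
print(attr)  # the zero-weight feature gets zero attribution
```

In a real library such as Captum, the gradient is obtained by automatic differentiation through an arbitrary PyTorch model rather than a closed form, but the shape of the computation (gradient of an output with respect to the input, combined with the input itself) is the same.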
