Paper Title
Machine Explanations and Human Understanding
Paper Authors
Paper Abstract
Explanations are hypothesized to improve human understanding of machine learning models and to achieve a variety of desirable outcomes, ranging from model debugging to enhancing human decision making. However, empirical studies have found mixed and even negative results. An open question, therefore, is under what conditions explanations can improve human understanding, and in what way. Using adapted causal diagrams, we provide a formal characterization of the interplay between machine explanations and human understanding, and show how human intuitions play a central role in enabling human understanding. Specifically, we identify three core concepts of interest that cover all existing quantitative measures of understanding in the context of human-AI decision making: task decision boundary, model decision boundary, and model error. Our key result is that, without assumptions about task-specific intuitions, explanations may potentially improve human understanding of the model decision boundary, but they cannot improve human understanding of the task decision boundary or model error. To achieve complementary human-AI performance, we articulate possible ways in which explanations can work with human intuitions. For instance, human intuitions about the relevance of features (e.g., that education is more important than age in predicting a person's income) can be critical in detecting model error. We validate the importance of human intuitions in shaping the outcome of machine explanations with empirical human-subject studies. Overall, our work provides a general framework, along with actionable implications for future algorithmic development and empirical experiments on machine explanations.