SparCAssist: A Model Risk Assessment Assistant Based on Sparse Generated Counterfactuals

Paper Title

SparCAssist: A Model Risk Assessment Assistant Based on Sparse Generated Counterfactuals

Authors

Zijian Zhang, Vinay Setty, Avishek Anand

Abstract

We introduce SparCAssist, a general-purpose risk assessment tool for machine learning models trained for language tasks. It evaluates a model's risk by inspecting its behavior on counterfactuals, namely out-of-distribution instances generated from a given data instance. The counterfactuals are generated by replacing tokens in rationale subsequences identified by ExPred, while the replacements are retrieved using HotFlip or Masked-Language-Model-based algorithms. The main purpose of our system is to help human annotators assess the model's deployment risk. The counterfactual instances generated during the assessment are a by-product and can be used to train more robust NLP models in the future.
