Paper Title

Knowledge is Power: Understanding Causality Makes Legal Judgment Prediction Models More Generalizable and Robust

Authors

Haotian Chen, Lingwei Zhang, Yiran Liu, Fanchao Chen, Yang Yu

Abstract

Legal Judgment Prediction (LJP), which aims to predict a judgment from a fact description according to the rule of law, serves as legal assistance to mitigate the heavy workload of the limited number of legal practitioners. Most existing methods fine-tune various large-scale pre-trained language models (PLMs) on LJP tasks to obtain consistent improvements. However, we discover that the state-of-the-art (SOTA) model makes judgment predictions based on irrelevant (non-causal) information. This violation of the rule of law not only weakens the robustness and generalization ability of models but also leads to severe social problems such as discrimination. In this paper, we use structural causal models (SCMs) to theoretically analyze how LJP models learn to make decisions and why they can pass the traditional testing paradigm without learning causality. Based on our analysis, we provide two causality-based solutions that intervene on the data and on the model, respectively. In detail, we first distinguish non-causal information by applying the open information extraction (OIE) technique. Then, we propose a method named the Causal Information Enhanced SAmpling Method (CIESAM) to eliminate the non-causal information from the data. To validate our theoretical analysis, we further propose another method that uses our proposed Causality-Aware Self-Attention Mechanism (CASAM) to guide the model to learn the underlying causal knowledge in legal texts. CASAM learns causal information with higher confidence than CIESAM. Extensive experimental results show that both of our proposed methods achieve state-of-the-art (SOTA) performance on three commonly used legal-specific datasets. The stronger performance of CASAM further demonstrates that causality is the key to the robustness and generalization ability of models.
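The abstract does not specify how CASAM is implemented. As an illustrative sketch only (not the paper's actual method), one common way to realize a "causality-aware" attention is to down-weight attention logits toward tokens flagged as non-causal, e.g., tokens that OIE marks as circumstantial details such as names or locations. The function name `causality_aware_attention` and the token-level `causal_mask` below are assumptions for illustration:

```python
import numpy as np

def causality_aware_attention(scores, causal_mask, penalty=-1e9):
    """Sketch: re-normalize attention so mass concentrates on causal tokens.

    scores:      (seq, seq) raw attention logits
    causal_mask: (seq,) boolean, True where a token carries causal information
    penalty:     large negative value added to non-causal columns before softmax
    """
    # Add a large negative penalty to every column whose token is non-causal;
    # broadcasting applies the same column mask to every query row.
    masked = scores + np.where(causal_mask, 0.0, penalty)
    # Numerically stable row-wise softmax.
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    return weights / weights.sum(axis=-1, keepdims=True)

# Toy example: 4 tokens with uniform logits; tokens 1 and 3 are non-causal.
scores = np.zeros((4, 4))
causal = np.array([True, False, True, False])
attn = causality_aware_attention(scores, causal)
# Each row now places (almost) all attention mass on tokens 0 and 2.
```

With uniform logits, masking leaves the two causal tokens sharing the attention mass equally, while the non-causal tokens receive essentially none; in a trained model the same masking would bias gradients toward causal evidence.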
