PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English

Paper title

PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English

Authors

Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, Kai-Wei Chang

Abstract

Privacy policies provide individuals with information about their rights and how their personal information is handled. Natural language understanding (NLU) technologies can help individuals and practitioners better understand the privacy practices described in lengthy and complex documents. However, existing efforts that use NLU technologies are limited because each processes the language in a way exclusive to a single task focused on certain privacy practices. To this end, we introduce the Privacy Policy Language Understanding Evaluation (PLUE) benchmark, a multi-task benchmark for evaluating privacy policy language understanding across various tasks. We also collect a large corpus of privacy policies to enable domain-specific language model pre-training on privacy policies. We evaluate several generic pre-trained language models and continue pre-training them on the collected corpus. We demonstrate that domain-specific continual pre-training offers performance improvements across all tasks.
