A Chinese Spelling Check Framework Based on Reverse Contrastive Learning

Paper Title

A Chinese Spelling Check Framework Based on Reverse Contrastive Learning

Authors

Nankai Lin, Hongyan Wu, Sihui Fu, Shengyi Jiang, Aimin Yang

Abstract

Chinese spelling check is the task of detecting and correcting spelling mistakes in Chinese text. Existing research aims to enhance the text representation and use multi-source information to improve the detection and correction capabilities of models, but pays little attention to improving their ability to distinguish between confusable words. Contrastive learning, which aims to minimize the distance in representation space between similar sample pairs, has recently become a dominant technique in natural language processing. Inspired by contrastive learning, we present a novel framework for Chinese spelling check, which consists of three modules: language representation, spelling check, and reverse contrastive learning. Specifically, we propose a reverse contrastive learning strategy, which explicitly forces the model to minimize the agreement between similar examples, namely phonetically and visually confusable characters. Experimental results show that our framework is model-agnostic and can be combined with existing Chinese spelling check models to yield state-of-the-art performance.
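
The abstract does not give the loss function, so the following is only a minimal sketch of the general idea of pushing confusable characters apart in representation space. It assumes a simple hinge penalty on cosine similarity; the function names, the `margin` parameter, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def reverse_contrastive_loss(anchor, confusables, margin=0.0):
    """Hypothetical penalty: average hinge on similarity above `margin`.

    `anchor` is the representation of a character; `confusables` holds the
    representations of its phonetically or visually confusable characters.
    The loss is high when they are close and zero once every similarity
    is at or below `margin`. This is an assumed form, not the paper's loss.
    """
    penalties = [max(0.0, cosine(anchor, c) - margin) for c in confusables]
    return sum(penalties) / len(penalties)


# Toy 2-d vectors (illustrative only).
close = reverse_contrastive_loss([1.0, 0.0], [[1.0, 0.0]])  # identical vectors
apart = reverse_contrastive_loss([1.0, 0.0], [[0.0, 1.0]])  # orthogonal vectors
mixed = reverse_contrastive_loss([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In a real model this term would presumably be added to the spelling check objective, so that minimizing it lowers the agreement between a character and its confusable counterparts while the main task loss is still optimized.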
