Paper Title
Zero-Shot Text Matching for Automated Auditing using Sentence Transformers
Authors
Abstract
Natural language processing methods have several applications in automated auditing, including document or passage classification, information retrieval, and question answering. However, training such models requires a large amount of annotated data, which is scarce in industrial settings. At the same time, techniques like zero-shot and unsupervised learning allow models pre-trained on general-domain data to be applied to unseen domains. In this work, we study the efficiency of unsupervised text matching using Sentence-BERT, a transformer-based model, by applying it to the semantic similarity of financial passages. Experimental results show that this model is robust to documents from both in-domain and out-of-domain data.
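The matching step described in the abstract can be illustrated with a minimal sketch. It assumes each passage has already been encoded into a fixed-size embedding (in practice by a Sentence-BERT encoder) and that matching is done by cosine similarity, which is a common choice but is not confirmed by the abstract. The function names `cosine_similarity` and `match_passages`, the variable names, and the toy three-dimensional vectors are hypothetical and only stand in for real embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_passages(
    query_vecs: Sequence[Sequence[float]],
    candidate_vecs: Sequence[Sequence[float]],
) -> List[Tuple[int, float]]:
    """For each query embedding, return (index of best candidate, similarity).

    Assumes candidate_vecs is non-empty. No labeled data or fine-tuning is
    involved: the ranking relies only on the pre-computed embeddings.
    """
    matches = []
    for q in query_vecs:
        scores = [cosine_similarity(q, c) for c in candidate_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        matches.append((best, scores[best]))
    return matches


# Hypothetical toy vectors standing in for sentence embeddings of
# audit requirements (queries) and financial report passages (candidates).
requirement_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
report_vecs = [[0.0, 2.0, 2.0], [3.0, 0.0, 0.0], [1.0, 1.0, 0.0]]

matches = match_passages(requirement_vecs, report_vecs)
for query_index, (candidate_index, score) in enumerate(matches):
    print(f"requirement {query_index} -> passage {candidate_index} (cosine {score:.3f})")
```

Because cosine similarity ignores vector length, passages whose embeddings point in the same direction match regardless of magnitude. With a real encoder, the toy lists would be replaced by the embeddings it produces for each passage.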