Paper Title
Unsupervised Inference of Data-Driven Discourse Structures using a Tree Auto-Encoder
Paper Authors
Paper Abstract
With a growing need for robust and general discourse structures in many downstream tasks and real-world applications, the current lack of high-quality, high-quantity discourse trees poses a severe shortcoming. To alleviate this limitation, we propose a new strategy to generate tree structures in a task-agnostic, unsupervised fashion by extending a latent tree induction framework with an auto-encoding objective. The proposed approach can be applied to any tree-structured objective, such as syntactic parsing, discourse parsing, and others. However, because annotating discourse trees is especially difficult, we initially develop this method to complement task-specific models in generating much larger and more diverse discourse treebanks.