Paper Title
Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck
Paper Authors
Paper Abstract
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning. However, a common pitfall in sequential text generation with VAEs is that the model tends to ignore the latent variables when paired with a strong auto-regressive decoder. In this paper, we propose a principled approach to alleviate this issue by applying a discretized bottleneck that enforces implicit latent feature matching in a more compact latent space. We impose a shared discrete latent space in which each input learns to choose a combination of latent atoms as a regularized latent representation. Our model exhibits a promising capability to model the underlying semantics of discrete sequences and thus provides more interpretable latent structures. Empirically, we demonstrate the model's efficiency and effectiveness on a broad range of tasks, including language modeling, unaligned text style transfer, dialog response generation, and neural machine translation.
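
Below is a minimal PyTorch sketch of one way such a discretized bottleneck can work, assuming a VQ-VAE-style shared codebook with nearest-atom lookup and a straight-through gradient estimator. The abstract describes each input choosing a combination of latent atoms; a single nearest atom is used here for simplicity, and all names (DiscreteBottleneck, num_atoms, beta) are illustrative rather than taken from the paper.

# Illustrative sketch, not the paper's implementation: quantize a continuous
# encoder state against a shared codebook of latent "atoms" before decoding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiscreteBottleneck(nn.Module):
    def __init__(self, num_atoms: int, dim: int, beta: float = 0.25):
        super().__init__()
        # Shared discrete latent space: a codebook every input draws from.
        self.codebook = nn.Embedding(num_atoms, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_atoms, 1.0 / num_atoms)
        self.beta = beta  # weight of the commitment term

    def forward(self, z_e: torch.Tensor):
        # z_e: (batch, dim) continuous encoder output.
        # Nearest codebook atom per input under L2 distance.
        dists = torch.cdist(z_e, self.codebook.weight)  # (batch, num_atoms)
        indices = dists.argmin(dim=-1)                  # (batch,)
        z_q = self.codebook(indices)                    # quantized latents

        # Codebook loss pulls atoms toward encoder outputs; the commitment
        # loss keeps the encoder close to its chosen atoms.
        loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())

        # Straight-through estimator: copy gradients from z_q back to z_e.
        z_q = z_e + (z_q - z_e).detach()
        return z_q, indices, loss

# Usage: quantize a batch of encoder states before passing them to a decoder.
bottleneck = DiscreteBottleneck(num_atoms=512, dim=64)
z_e = torch.randn(8, 64)
z_q, indices, vq_loss = bottleneck(z_e)
print(z_q.shape, indices.shape, vq_loss.item())

Because the decoder only ever sees vectors drawn from the compact shared codebook, it cannot simply bypass the latent variable, which is the intuition behind using a discrete bottleneck to counter latent-variable collapse with strong auto-regressive decoders.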