Paper Title
Learning Architectures from an Extended Search Space for Language Modeling
Paper Authors
Paper Abstract
Neural architecture search (NAS) has advanced significantly in recent years, but most NAS systems restrict the search to learning the architecture of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learning both intra-cell and inter-cell architectures (we call it ESS). To obtain better search results, we design a joint learning method that performs intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it significantly outperforms a strong baseline on the PTB and WikiText data, achieving a new state of the art on PTB. Moreover, the learned architectures show good transferability to other systems. For example, they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and the CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.
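To make the core idea concrete, here is a minimal sketch of how joint intra-cell and inter-cell search might look under a DARTS-style continuous relaxation, where a softmax over architecture parameters mixes candidate operations (intra-cell) and connections to earlier cells (inter-cell). All names (MixedOp, ESSCell, the candidate operation set) are hypothetical illustrations under these assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical minimal set of candidate operations for one edge.
OPS = {
    "tanh":     lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.Tanh()),
    "relu":     lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),
    "identity": lambda dim: nn.Identity(),
}

class MixedOp(nn.Module):
    """DARTS-style mixed operation: softmax-weighted sum of candidate ops."""
    def __init__(self, dim):
        super().__init__()
        self.ops = nn.ModuleList(op(dim) for op in OPS.values())

    def forward(self, x, alpha):
        # alpha: unnormalized architecture weights, one per candidate op.
        weights = F.softmax(alpha, dim=-1)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

class ESSCell(nn.Module):
    """Toy cell that relaxes intra-cell ops and inter-cell links jointly."""
    def __init__(self, dim, num_prev_cells):
        super().__init__()
        self.intra_op = MixedOp(dim)
        # Intra-cell architecture parameters (one weight per candidate op).
        self.alpha_intra = nn.Parameter(1e-3 * torch.randn(len(OPS)))
        # Inter-cell parameters: how strongly to read each earlier cell.
        self.alpha_inter = nn.Parameter(1e-3 * torch.randn(num_prev_cells))

    def forward(self, x, prev_outputs):
        # Inter-cell search: softmax-gated mix of earlier cells' outputs.
        gates = F.softmax(self.alpha_inter, dim=-1)
        context = sum(g * h for g, h in zip(gates, prev_outputs))
        # Intra-cell search: transform the fused input with the mixed op.
        return self.intra_op(x + context, self.alpha_intra)

# Usage sketch: a cell reading from two earlier cells' hidden states.
dim = 16
cell = ESSCell(dim, num_prev_cells=2)
x = torch.randn(4, dim)
prev = [torch.randn(4, dim), torch.randn(4, dim)]
out = cell(x, prev)
print(out.shape)  # torch.Size([4, 16])
```

In a full differentiable-search system, training would alternate between updating the model weights and the architecture parameters (the alpha tensors), as in bi-level DARTS optimization; the joint learning described in the abstract amounts to optimizing the intra-cell and inter-cell parameters together rather than in separate searches.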