Paper Title
Learning Architectures from an Extended Search Space for Language Modeling
Paper Authors
Paper Abstract
Neural architecture search (NAS) has advanced significantly in recent years, but most NAS systems restrict the search to learning the architecture of a recurrent or convolutional cell. In this paper, we extend the search space of NAS. In particular, we present a general approach to learning both intra-cell and inter-cell architectures (we call it ESS). To obtain better search results, we design a joint learning method that performs intra-cell and inter-cell NAS simultaneously. We implement our model in a differentiable architecture search system. For recurrent neural language modeling, it significantly outperforms a strong baseline on the PTB and WikiText data, achieving a new state of the art on PTB. Moreover, the learned architectures show good transferability to other systems. For example, they improve state-of-the-art systems on the CoNLL and WNUT named entity recognition (NER) tasks and the CoNLL chunking task, indicating a promising line of research on large-scale pre-learned architectures.
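To make the core idea concrete, here is a minimal sketch of how joint intra-cell and inter-cell search might look under a DARTS-style continuous relaxation, where a softmax over architecture parameters mixes candidate operations (intra-cell) and connections to earlier cells (inter-cell). All names (MixedOp, ESSCell, the candidate operation set) are hypothetical illustrations under these assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical minimal set of candidate operations for one edge.
OPS = {
    "tanh":     lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.Tanh()),
    "relu":     lambda dim: nn.Sequential(nn.Linear(dim, dim), nn.ReLU()),
    "identity": lambda dim: nn.Identity(),
}

class MixedOp(nn.Module):
    """DARTS-style mixed operation: softmax-weighted sum of candidate ops."""
    def __init__(self, dim):
        super().__init__()
        self.ops = nn.ModuleList(op(dim) for op in OPS.values())

    def forward(self, x, alpha):
        # alpha: unnormalized architecture weights, one per candidate op.
        weights = F.softmax(alpha, dim=-1)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

class ESSCell(nn.Module):
    """Toy cell that relaxes intra-cell ops and inter-cell links jointly."""
    def __init__(self, dim, num_prev_cells):
        super().__init__()
        self.intra_op = MixedOp(dim)
        # Intra-cell architecture parameters (one weight per candidate op).
        self.alpha_intra = nn.Parameter(1e-3 * torch.randn(len(OPS)))
        # Inter-cell parameters: how strongly to read each earlier cell.
        self.alpha_inter = nn.Parameter(1e-3 * torch.randn(num_prev_cells))

    def forward(self, x, prev_outputs):
        # Inter-cell search: softmax-gated mix of earlier cells' outputs.
        gates = F.softmax(self.alpha_inter, dim=-1)
        context = sum(g * h for g, h in zip(gates, prev_outputs))
        # Intra-cell search: transform the fused input with the mixed op.
        return self.intra_op(x + context, self.alpha_intra)

# Usage sketch: a cell reading from two earlier cells' hidden states.
dim = 16
cell = ESSCell(dim, num_prev_cells=2)
x = torch.randn(4, dim)
prev = [torch.randn(4, dim), torch.randn(4, dim)]
out = cell(x, prev)
print(out.shape)  # torch.Size([4, 16])
```

In a full differentiable-search system, training would alternate between updating the model weights and the architecture parameters (the alpha tensors), as in bi-level DARTS optimization; the joint learning described in the abstract amounts to optimizing the intra-cell and inter-cell parameters together rather than in separate searches.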