Paper Title
Attention improves concentration when learning node embeddings
Paper Authors
Paper Abstract
We consider the problem of predicting edges in a graph from node attributes in an e-commerce setting. Specifically, given nodes labelled with search query text, we want to predict links to related queries that share products. Experiments with a range of deep neural architectures show that simple feedforward networks with an attention mechanism perform best for learning embeddings. The simplicity of these models allows us to explain the performance of attention. We propose an analytically tractable model of query generation, AttEST, that views both products and the query text as vectors embedded in a latent space. We prove (and empirically validate) that the point-wise mutual information (PMI) matrix of the AttEST query text embeddings displays a low-rank behavior analogous to that observed in word embeddings. This low-rank property allows us to derive a loss function that maximizes the mutual information between related queries, which we use to train an attention network to learn query embeddings. This AttEST network beats traditional memory-based LSTM architectures by over 20% on F-1 score. We justify this outperformance by showing that the weights from the attention mechanism correlate strongly with the weights of the best linear unbiased estimator (BLUE) for the product vectors, and conclude that attention plays an important role in variance reduction.
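To make the variance-reduction point concrete, below is a minimal sketch, not the paper's AttEST implementation: it assumes a toy model in which query-token vectors are independent noisy observations of a latent product vector, with hypothetical dimensions, noise levels, and variable names chosen only for illustration. It compares a uniform average of the tokens against the BLUE, i.e. the inverse-variance weighted average, whose weights the abstract reports the learned attention weights correlate with strongly.

import numpy as np

rng = np.random.default_rng(0)
d, n_tokens, n_trials = 32, 6, 2000   # illustrative sizes, not from the paper

# Per-token noise levels (assumed): some tokens are much noisier
# observations of the latent product vector than others.
sigmas = rng.uniform(0.2, 2.0, size=n_tokens)

# BLUE weights for averaging independent unbiased observations:
# proportional to 1 / sigma_i^2, normalised to sum to one.
blue_w = 1.0 / sigmas**2
blue_w /= blue_w.sum()

err_uniform, err_blue = 0.0, 0.0
for _ in range(n_trials):
    product = rng.normal(size=d)                       # latent product vector
    noise = rng.normal(size=(n_tokens, d)) * sigmas[:, None]
    tokens = product + noise                           # noisy query-token vectors
    err_uniform += np.linalg.norm(tokens.mean(axis=0) - product)
    err_blue += np.linalg.norm(blue_w @ tokens - product)

print(f"mean error, uniform average: {err_uniform / n_trials:.3f}")
print(f"mean error, BLUE weighting : {err_blue / n_trials:.3f}")

The BLUE weighting yields a lower estimation error than the uniform average whenever the token noise levels differ; the abstract's claim is that the trained attention mechanism recovers weights close to these BLUE weights, which is the sense in which attention reduces variance when learning the embeddings.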