Paper Title

On the Replicability of Combining Word Embeddings and Retrieval Models

Paper Authors

Luca Papariello, Alexandros Bampoulidis, Mihai Lupu

Paper Abstract

We replicate recent experiments attempting to demonstrate an attractive hypothesis about the use of the Fisher kernel framework and mixture models for aggregating word embeddings towards document representations and the use of these representations in document classification, clustering, and retrieval. Specifically, the hypothesis was that the use of a mixture model of von Mises-Fisher (VMF) distributions instead of Gaussian distributions would be beneficial because of the focus on cosine distances of both VMF and the vector space model traditionally used in information retrieval. Previous experiments had validated this hypothesis. Our replication was not able to validate it, despite a large parameter scan space.
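To make the hypothesis concrete, here is a minimal sketch (not the authors' code) of why a VMF mixture is said to "focus on cosine distances". It assumes a shared concentration `kappa` across VMF components and an isotropic shared variance `sigma2` across Gaussian components, so the normalizing constants cancel when computing component responsibilities. All function and parameter names are hypothetical and chosen for illustration only.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def softmax(logits):
    """Numerically stable softmax over a list of log-scores."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def vmf_responsibilities(x, means, kappa, weights):
    """Posterior component probabilities under a VMF mixture.

    Assumption: every component shares the same kappa, so the VMF
    normalizer is identical across components and cancels out.
    The remaining log-score is kappa * cos(x, mu_k).
    """
    x = normalize(x)
    logits = []
    for mu, w in zip(means, weights):
        mu = normalize(mu)
        cosine = sum(a * b for a, b in zip(x, mu))
        logits.append(math.log(w) + kappa * cosine)
    return softmax(logits)


def gaussian_responsibilities(x, means, sigma2, weights):
    """Posterior component probabilities under an isotropic Gaussian mixture.

    Assumption: every component shares the same variance sigma2, so the
    log-score reduces to the negative squared Euclidean distance.
    """
    logits = []
    for mu, w in zip(means, weights):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, mu))
        logits.append(math.log(w) - sq_dist / (2.0 * sigma2))
    return softmax(logits)
```

Under these assumptions, rescaling a word embedding leaves its VMF responsibilities unchanged, because only the angle to each mean direction matters, whereas the Gaussian responsibilities shift with the vector's length. This is the property the hypothesis relies on when pairing VMF with the cosine-based vector space model; whether it yields better retrieval is exactly what the replication calls into question.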
