Paper Title

Hardware-agnostic Computation for Large-scale Knowledge Graph Embeddings

Paper Authors

Caglar Demir, Axel-Cyrille Ngonga Ngomo

Paper Abstract

Knowledge graph embedding research has mainly focused on learning continuous representations of knowledge graphs for the link prediction problem. Recently developed frameworks can be effectively applied in research-related applications. Yet, these frameworks do not fulfill many requirements of real-world applications. As the size of the knowledge graph grows, moving computation from a commodity computer to a cluster of computers in these frameworks becomes more challenging. Finding suitable hyperparameter settings with respect to time and computational budgets is left to practitioners. In addition, the continual learning aspect of knowledge graph embedding frameworks is often ignored, although continual learning plays an important role in many real-world (deep) learning-driven applications. Arguably, these limitations explain the lack of publicly available knowledge graph embedding models for large knowledge graphs. We developed a framework built on DASK, PyTorch Lightning, and Hugging Face to compute embeddings for large-scale knowledge graphs in a hardware-agnostic manner, which is able to address real-world challenges pertaining to the scale of real applications. We provide an open-source version of our framework along with a hub of pre-trained models with more than 11.4 billion parameters.
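To make the "hardware-agnostic" idea concrete, the following is a minimal sketch (not the paper's actual code) of how a PyTorch Lightning training script, as mentioned in the abstract, can run unchanged on a CPU, a single GPU, or a multi-GPU machine by delegating device selection to the Trainer. The toy DistMult scorer, the entity/relation counts, and the synthetic triples are invented here purely for illustration.

```python
# Minimal sketch: hardware-agnostic training of a toy KGE model with
# PyTorch Lightning. All data below is synthetic and for illustration only.
import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, TensorDataset


class DistMult(pl.LightningModule):
    def __init__(self, num_entities: int, num_relations: int, dim: int = 32):
        super().__init__()
        self.ent = torch.nn.Embedding(num_entities, dim)
        self.rel = torch.nn.Embedding(num_relations, dim)

    def forward(self, h, r, t):
        # DistMult triple score: sum_i e_h[i] * w_r[i] * e_t[i]
        return (self.ent(h) * self.rel(r) * self.ent(t)).sum(dim=-1)

    def training_step(self, batch, batch_idx):
        h, r, t, y = batch
        loss = torch.nn.functional.binary_cross_entropy_with_logits(
            self(h, r, t), y.float())
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.01)


if __name__ == "__main__":
    # Synthetic (head, relation, tail, label) triples.
    n = 1000
    data = TensorDataset(torch.randint(0, 100, (n,)),
                         torch.randint(0, 10, (n,)),
                         torch.randint(0, 100, (n,)),
                         torch.randint(0, 2, (n,)))
    # accelerator="auto" and devices="auto" let Lightning pick whatever
    # hardware is available, so the same script runs on CPU, GPU, or
    # multiple GPUs without code changes.
    trainer = pl.Trainer(max_epochs=1, accelerator="auto", devices="auto")
    trainer.fit(DistMult(100, 10), DataLoader(data, batch_size=128))
```

The same delegation principle is what allows a framework to scale from a commodity computer to a cluster: the model and training step stay fixed while the execution backend is chosen at launch time.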
