AAE: An Active Auto-Estimator for Improving Graph Storage

Paper title

AAE: An Active Auto-Estimator for Improving Graph Storage

Authors

Yan, Yu, Yang, Man, Wang, Hongzhi, Wang, Yuzhuo

Abstract

Graphs are an increasingly popular model in many real-world applications, and the efficiency of graph storage is crucial to them. Tuning graph storage generally relies on database administrators (DBAs) to find the best graph storage. However, DBAs make tuning decisions mainly from experience and intuition. Because that experience is limited, a tuning operation may have unpredictable performance and can even reduce efficiency. In this paper, we observe that an estimator for graph workloads has the potential to guarantee the performance of tuning operations. Unfortunately, because of the complex characteristics of the graph evaluation task, no mature estimator for graph workloads exists. We formulate graph workload evaluation as a classification task and carefully design the feature engineering process, which covers graph data features, graph workload features, and graph storage features. Given the complexity of graph features and the long execution time of graph workloads, it is difficult to obtain a sufficiently large training set for a graph workload estimator. We therefore propose an active auto-estimator (AAE) for graph workload evaluation that combines active learning and deep learning. AAE achieves good evaluation efficiency with a limited training set. We test the time efficiency and evaluation accuracy of AAE on two open-source graph datasets, LDBC and Freebase. Experimental results show that our estimator completes graph workload evaluation efficiently, within milliseconds.
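
The abstract describes combining active learning with a learned classifier so that few workloads need to be executed for labels. The sketch below illustrates that general idea only, using pool-based uncertainty sampling. It is not the paper's method: the two features, the `oracle` function (a stand-in for actually running a workload), and the logistic-regression model (swapped in for the deep model to keep the example dependency-free) are all assumptions made for illustration.

```python
import math
import random


def predict_proba(w, x):
    """Probability of class 1 under a logistic model; w[-1] is the bias."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))


def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit a tiny logistic-regression classifier with per-sample gradient steps."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                w[j] -= lr * err * xj
            w[-1] -= lr * err
    return w


def active_learning(pool, oracle, seed_size=4, rounds=10, rng=None):
    """Label a small seed set, then repeatedly label the most uncertain sample."""
    rng = rng or random.Random(0)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    labeled = [unlabeled.pop() for _ in range(seed_size)]
    labels = [oracle(x) for x in labeled]
    w = train_logreg(labeled, labels)
    for _ in range(rounds):
        if not unlabeled:
            break
        # Uncertainty sampling: pick the sample whose prediction is closest to 0.5.
        idx = min(range(len(unlabeled)),
                  key=lambda i: abs(predict_proba(w, unlabeled[i]) - 0.5))
        x = unlabeled.pop(idx)
        labeled.append(x)
        labels.append(oracle(x))  # the expensive step: one workload execution
        w = train_logreg(labeled, labels)
    return w, len(labeled)


# Hypothetical features per sample: (graph density, share of traversal queries).
data_rng = random.Random(42)
pool = [(data_rng.random(), data_rng.random()) for _ in range(200)]

calls = []  # records every simulated workload execution


def oracle(x):
    """Hypothetical label: 1 if storage layout A would be faster, else 0."""
    calls.append(x)
    return 1 if x[0] + x[1] > 1.0 else 0


weights, n_labeled = active_learning(pool, oracle)
print("labels requested:", n_labeled, "of", len(pool), "candidates")
```

The point of the design is that the expensive labeling step runs only for the seed set plus one sample per round, rather than for the whole candidate pool.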
