Paper Title
Towards a Taxonomy for the Use of Synthetic Data in Advanced Analytics
Paper Authors
Paper Abstract
The proliferation of deep learning techniques has led to a wide range of advanced analytics applications in important business areas such as predictive maintenance or product recommendation. However, as the effectiveness of advanced analytics naturally depends on the availability of sufficient data, an organization's ability to exploit the benefits might be restricted by limited data or, likewise, limited data access. These challenges could force organizations to spend substantial amounts of money on data, accept constrained analytics capacities, or even turn into a showstopper for analytics projects. Against this backdrop, recent advances in deep learning to generate synthetic data may help to overcome these barriers. Despite their great potential, however, synthetic data are rarely employed. Therefore, we present a taxonomy highlighting the various facets of deploying synthetic data for advanced analytics systems. Furthermore, we identify typical application scenarios for synthetic data to assess the current state of adoption and thereby unveil missed opportunities to pave the way for further research.