ALDI++: Automatic and parameter-less discord and outlier detection for building energy load profiles

Paper title

ALDI++: Automatic and parameter-less discord and outlier detection for building energy load profiles

Authors

Matias Quintana, Till Stoeckmann, June Young Park, Marian Turowski, Veit Hagenmeyer, Clayton Miller

Abstract

Data-driven building energy prediction is an integral part of the process for measurement and verification, building benchmarking, and building-to-grid interaction. The ASHRAE Great Energy Predictor III (GEPIII) machine learning competition used an extensive meter data set to crowdsource the most accurate machine learning workflow for whole-building energy prediction. A significant component of the winning solutions was the pre-processing phase, which removed anomalous training data. Contemporary pre-processing methods either filter on statistical thresholds or rely on deep learning, which requires training data and multiple hyper-parameters. A recent method named ALDI (Automated Load profile Discord Identification) identifies these discords using the matrix profile, but the technique still requires user-defined parameters. We develop ALDI++, a method based on the previous work that bypasses user-defined parameters and takes advantage of discord similarity. We evaluate ALDI++ against a statistical threshold, a variational auto-encoder, and the original ALDI as baselines in discord classification and energy forecasting scenarios. Our results demonstrate that while the classification performance improvement over the original method is marginal, ALDI++ helps achieve the best forecasting error, improving by 6% over the winning team's approach with six times less computation time.
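
The abstract names the matrix profile as the tool ALDI uses to find discords but does not describe the procedure. As a rough illustration of the general idea only, and not of ALDI or ALDI++ themselves, the sketch below computes a naive matrix profile in plain Python: for every window of length m it stores the z-normalized Euclidean distance to the nearest non-overlapping window, and the window with the largest such distance is taken as the top discord. The function names (znorm, matrix_profile, top_discord), the synthetic hourly load, and the window length of 24 are assumptions made for this example.

```python
import math


def znorm(window):
    """Z-normalize one window; a (near-)flat window maps to all zeros."""
    m = len(window)
    mean = sum(window) / m
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / m)
    if std < 1e-12:
        return [0.0] * m
    return [(v - mean) / std for v in window]


def matrix_profile(series, m):
    """Naive matrix profile: distance from each length-m window to its
    nearest neighbor among windows that do not overlap it."""
    n = len(series) - m + 1
    if n < 2 * m:
        # Guarantees that every window has at least one non-overlapping neighbor.
        raise ValueError("series must hold at least 3*m - 1 points")
    windows = [znorm(series[i:i + m]) for i in range(n)]
    profile = [math.inf] * n
    for i in range(n):
        # Starting at i + m skips overlapping windows (the exclusion zone).
        for j in range(i + m, n):
            d = math.dist(windows[i], windows[j])
            if d < profile[i]:
                profile[i] = d
            if d < profile[j]:
                profile[j] = d
    return profile


def top_discord(series, m):
    """Start index of the window farthest from its nearest neighbor."""
    profile = matrix_profile(series, m)
    return max(range(len(profile)), key=profile.__getitem__)


# Hypothetical hourly load: 20 identical "days" plus one injected spike.
load = [math.sin(2 * math.pi * t / 24) for t in range(480)]
load[250] += 5.0

profile = matrix_profile(load, 24)
discord_start = top_discord(load, 24)
print(discord_start)  # a start index whose 24-hour window covers hour 250
```

This brute-force version costs O(n² m) time and is meant only to make the concept concrete. Note that the window length m is itself a user-chosen parameter here, which is the kind of setting the abstract says ALDI++ aims to avoid.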
