Paper Title
Rethinking Streaming Machine Learning Evaluation
Paper Authors
Paper Abstract
While most work on evaluating machine learning (ML) models focuses on computing accuracy on batches of data, tracking accuracy alone in a streaming setting (i.e., unbounded, timestamp-ordered datasets) fails to appropriately identify when models are performing unexpectedly. In this position paper, we discuss how the nature of streaming ML problems introduces new real-world challenges (e.g., delayed arrival of labels) and recommend additional metrics to assess streaming ML performance.
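To make the abstract's claim concrete, here is a minimal sketch (not from the paper; all names and data are illustrative) of why a single cumulative accuracy number can mask recent degradation in a stream, and how delayed labels complicate evaluation: a label may only become available some number of timesteps after the prediction is made, so metrics can only be updated as labels arrive.

```python
from collections import deque

def windowed_accuracy(events, window=4):
    """Sliding-window accuracy over a label-delayed stream (illustrative sketch).

    `events` is an iterable of (prediction, label, label_delay) tuples in
    timestamp order; a label with delay d only becomes available d steps
    after the prediction is made, so it cannot be scored before then.
    Returns the window accuracy observed at each timestep (None until the
    first label arrives).
    """
    pending = []                    # (available_at, correct) labels in flight
    recent = deque(maxlen=window)   # correctness of most recent arrived labels
    history = []
    for t, (pred, label, delay) in enumerate(events):
        pending.append((t + delay, pred == label))
        # Release only the labels that have arrived by time t.
        still_pending = []
        for available_at, correct in pending:
            if available_at <= t:
                recent.append(correct)
            else:
                still_pending.append((available_at, correct))
        pending = still_pending
        history.append(sum(recent) / len(recent) if recent else None)
    return history

# Model is correct early, then degrades; late labels arrive 2 steps late.
stream = [(1, 1, 0)] * 4 + [(0, 1, 2)] * 4
print(windowed_accuracy(stream, window=4))
```

In this toy stream the cumulative accuracy over all arrived labels is still about 0.67 at the final step, while the window accuracy has already fallen to 0.5, and the two wrong predictions still in flight have not been counted at all. This is the kind of gap between batch-style accuracy and streaming behavior the paper argues additional metrics should expose.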