Rethinking Streaming Machine Learning Evaluation

Machine Learning; 2022-05-24; v1; Artificial Intelligence, Machine Learning

Abstract

While most work on evaluating machine learning (ML) models focuses on computing accuracy on batches of data, tracking accuracy alone in a streaming setting (i.e., unbounded, timestamp-ordered datasets) fails to appropriately identify when models are performing unexpectedly. In this position paper, we discuss how the nature of streaming ML problems introduces new real-world challenges (e.g., delayed arrival of labels) and recommend additional metrics to assess streaming ML performance.

Cite

@article{arxiv.2205.11473,
  title  = {Rethinking Streaming Machine Learning Evaluation},
  author = {Shreya Shankar and Bernease Herman and Aditya G. Parameswaran},
  journal= {arXiv preprint arXiv:2205.11473},
  year   = {2022}
}

Comments

ML Evaluation Standards Workshop (ICLR 2022)