Benchmarking Learned Indexes

Ryan Marcus; Andreas Kipf; Alexander van Renen; Mihail Stoian; Sanchit Misra; Alfons Kemper; Thomas Neumann; Tim Kraska

doi:10.14778/3421424.3421425

Databases 2023-03-28 v2

Abstract

Recent advancements in learned index structures propose replacing existing index structures, like B-Trees, with approximate learned models. In this work, we present a unified benchmark that compares well-tuned implementations of three learned index structures against several state-of-the-art "traditional" baselines. Using four real-world datasets, we demonstrate that learned index structures can indeed outperform non-learned indexes in read-only in-memory workloads over a dense array. We also investigate the impact of caching, pipelining, dataset size, and key size. We study the performance profile of learned index structures, and build an explanation for why learned models achieve such good performance. Finally, we investigate other important properties of learned index structures, such as their performance in multi-threaded systems and their build times.
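
To make the idea of an "approximate learned model" over a dense sorted array concrete, here is a minimal sketch. It is not one of the structures benchmarked in the paper: it assumes a single least-squares line as the model, and the class name `LinearLearnedIndex` and its methods are hypothetical names chosen for illustration. The model predicts a position for a key, and a binary search restricted to the model's maximum observed error corrects the prediction.

```python
import bisect


class LinearLearnedIndex:
    """Toy learned index: one linear model plus a bounded search.

    Illustrative assumption only; `keys` must be a non-empty list of
    numbers sorted in ascending order.
    """

    def __init__(self, keys):
        self.keys = keys
        n = len(keys)
        # Fit position ~= slope * key + intercept by ordinary least squares.
        mean_key = sum(keys) / n
        mean_pos = (n - 1) / 2
        var = sum((k - mean_key) ** 2 for k in keys)
        cov = sum((k - mean_key) * (i - mean_pos) for i, k in enumerate(keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_pos - self.slope * mean_key
        # Largest distance between a predicted and a true position.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(keys))

    def _predict(self, key):
        # Approximate position of `key` in the sorted array.
        return int(round(self.slope * key + self.intercept))

    def lower_bound(self, key):
        """Return the index of the first element >= key."""
        n = len(self.keys)
        pred = self._predict(key)
        # The model is monotone in the key, so the answer lies within
        # max_err (+1 for keys that are absent) of the prediction.
        lo = min(max(pred - self.max_err, 0), n)
        hi = max(min(pred + self.max_err + 1, n), lo)
        return bisect.bisect_left(self.keys, key, lo, hi)


keys = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
index = LinearLearnedIndex(keys)
print(index.lower_bound(20), index.max_err)
```

The better the model fits the key distribution, the smaller `max_err` becomes and the fewer array elements the final search has to touch, which is the basic trade-off a learned index exploits.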

Keywords

machine learning, machine learning theory, decision tree

Cite

@article{arxiv.2006.12804,
  title  = {Benchmarking Learned Indexes},
  author = {Ryan Marcus and Andreas Kipf and Alexander van Renen and Mihail Stoian and Sanchit Misra and Alfons Kemper and Thomas Neumann and Tim Kraska},
  journal= {arXiv preprint arXiv:2006.12804},
  year   = {2023}
}
