Optimizing I/O for Big Array Analytics

Databases 2012-04-30 v1

Abstract

Big array analytics is becoming indispensable in answering important scientific and business questions. Most analysis tasks consist of multiple steps, each making one or more passes over the arrays to be analyzed and generating intermediate results. In the big data setting, I/O optimization is key to efficient analytics. In this paper, we develop a framework and techniques for capturing a broad range of analysis tasks expressible in nested-loop form, representing them declaratively, and optimizing their I/O by identifying sharing opportunities. Experimental results show that our optimizer can find execution plans that exploit nontrivial I/O sharing opportunities, yielding significant savings.
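
To make the idea of I/O sharing between analysis steps concrete, the following is a minimal sketch and not the paper's framework or optimizer. It assumes a hypothetical `BlockedArray` class that counts simulated block reads, and two toy steps (a sum and a sum of squares) written as loops over blocks. Run separately, each step scans the array once; fused, both steps consume each block from a single read.

```python
# Illustrative sketch only: BlockedArray, BLOCK, and both functions are
# hypothetical names, not taken from the paper.
BLOCK = 4  # elements per simulated disk block


class BlockedArray:
    """A 1-D array stored in fixed-size blocks, counting block reads."""

    def __init__(self, values):
        self.values = list(values)
        self.reads = 0

    def num_blocks(self):
        return (len(self.values) + BLOCK - 1) // BLOCK

    def read_block(self, i):
        self.reads += 1  # each call stands in for one I/O
        return self.values[i * BLOCK:(i + 1) * BLOCK]


def separate_passes(arr):
    """Two steps, each with its own loop over the blocks."""
    total = 0
    for i in range(arr.num_blocks()):
        total += sum(arr.read_block(i))
    squares = 0
    for i in range(arr.num_blocks()):
        squares += sum(x * x for x in arr.read_block(i))
    return total, squares


def shared_pass(arr):
    """The same two steps fused: each block is read once and used by both."""
    total = 0
    squares = 0
    for i in range(arr.num_blocks()):
        block = arr.read_block(i)
        total += sum(block)
        squares += sum(x * x for x in block)
    return total, squares


a = BlockedArray(range(10))
b = BlockedArray(range(10))
result_separate = separate_passes(a)
result_shared = shared_pass(b)
print(result_separate, a.reads)
print(result_shared, b.reads)
```

Both versions compute the same results, but the fused loop issues half as many block reads in this toy setting. Real sharing opportunities across multi-dimensional arrays and intermediate results are considerably less obvious, which is where an optimizer becomes useful.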

Cite

@article{arxiv.1204.6081,
  title  = {Optimizing I/O for Big Array Analytics},
  author = {Yi Zhang and Jun Yang},
  journal= {arXiv preprint arXiv:1204.6081},
  year   = {2012}
}

Comments

VLDB 2012