Cohort Query Processing

Databases 2016-05-05 v4

Abstract

Modern Internet applications often produce a large volume of user activity records. Data analysts are interested in cohort analysis, that is, finding unusual user behavioral trends in these large tables of activity records. In a traditional database system, cohort analysis queries are both painful to specify and expensive to evaluate. We propose extending database systems to support cohort analysis by adding three new operators to SQL. We devise three evaluation schemes for cohort query processing. Two of them adopt a non-intrusive approach that implements cohort queries on top of regular relational databases. The third employs a columnar-based evaluation scheme with optimizations specifically designed for cohort query processing. Our experimental results confirm the performance benefits of the proposed columnar database system over the two non-intrusive approaches.
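
To make the notion of cohort analysis concrete, the following is a minimal sketch in plain Python, not the SQL operators or the evaluation schemes proposed in the paper. It assumes a hypothetical record layout `(user, day, action, amount)`, treats a user's first recorded activity as their "birth", groups users into weekly cohorts by birth, and sums spending by cohort and by age (weeks since birth). The names `records` and `cohort_report` and the sample data are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical activity records: (user, day, action, amount spent).
records = [
    ("u1", 1, "launch", 0),
    ("u1", 2, "shop", 10),
    ("u1", 9, "shop", 5),
    ("u2", 1, "launch", 0),
    ("u2", 3, "shop", 20),
    ("u3", 8, "launch", 0),
    ("u3", 9, "shop", 7),
]


def cohort_report(records, cohort_size_days=7):
    """Sum `amount` per (cohort, age), both measured in weeks."""
    # Assumption: a user's birth is the day of their earliest record.
    birth = {}
    for user, day, _action, _amount in sorted(records, key=lambda r: r[1]):
        birth.setdefault(user, day)

    totals = defaultdict(int)
    for user, day, _action, amount in records:
        cohort = (birth[user] - 1) // cohort_size_days  # week of birth (days start at 1)
        age = (day - birth[user]) // cohort_size_days   # whole weeks since birth
        totals[(cohort, age)] += amount
    return dict(totals)


report = cohort_report(records)
for (cohort, age), total in sorted(report.items()):
    print(f"cohort={cohort} age={age} total_spent={total}")
```

Even this small report needs a per-user pass to find birth times followed by a second pass to group by cohort and age. Expressed in standard SQL, the same logic typically requires joins and nested subqueries, which illustrates why such queries are awkward to specify and costly to evaluate in a traditional database system.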

Cite

@article{arxiv.1601.00182,
  title  = {Cohort Query Processing},
  author = {Dawei Jiang and Qingchao Cai and Gang Chen and H. V. Jagadish and Beng Chin Ooi and Kian-Lee Tan and Anthony K. H. Tung},
  journal= {arXiv preprint arXiv:1601.00182},
  year   = {2016}
}