Reductions for Frequency-Based Data Mining Problems

Computational Complexity 2017-09-05 v1

Abstract

Studying the computational complexity of problems is one of the fundamental questions in computer science, if not the fundamental question. Yet surprisingly little is known about the computational complexity of many central problems in data mining. In this paper we study frequency-based problems and propose a new type of reduction that allows us to compare the complexities of the maximal frequent pattern mining problems in different domains (e.g. graphs or sequences). Our results extend those of Kimelfeld and Kolaitis [ACM TODS, 2014] to a broader range of data mining problems. We show that, by allowing constraints in the pattern space, the complexities of many maximal frequent pattern mining problems collapse. These problems include maximal frequent subgraphs in labelled graphs, maximal frequent itemsets, and maximal frequent subsequences with no repetitions. In addition to their theoretical interest, our results might yield more efficient algorithms for the studied problems.

Cite

@article{arxiv.1709.00900,
  title  = {Reductions for Frequency-Based Data Mining Problems},
  author = {Stefan Neumann and Pauli Miettinen},
  journal = {arXiv preprint arXiv:1709.00900},
  year   = {2017}
}

Comments

This is an extended version of a paper of the same title to appear in the Proceedings of the 17th IEEE International Conference on Data Mining (ICDM'17).