Fast Sequence Segmentation using Log-Linear Models

Data Structures and Algorithms 2019-02-12 v1

Abstract

Sequence segmentation is a well-studied problem: given a sequence of elements, an integer K, and some measure of homogeneity, the task is to split the sequence into K contiguous segments that are maximally homogeneous. The classic approach to finding the optimal solution is a dynamic program. Unfortunately, the execution time of this program is quadratic in the length of the input sequence, which makes the algorithm slow for long sequences. In this paper we study segmentations whose measure of goodness is based on log-linear models, a rich family that contains many of the standard distributions. We present a theoretical result that allows us to prune many suboptimal segmentations. Using this result, we modify the standard dynamic program for one-dimensional log-linear models and thereby reduce the computation time. We demonstrate empirically that this approach can significantly reduce the computational burden of finding the optimal segmentation.
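
For orientation, the quadratic baseline mentioned in the abstract can be sketched as follows. This is a minimal illustration of the standard dynamic program only, not the pruned algorithm proposed in the paper. It assumes a squared-error cost (the sum of squared deviations from the segment mean) as the homogeneity measure, which is one common choice; the function name `segment` and its return format are hypothetical.

```python
from itertools import accumulate


def segment(seq, k):
    """Split seq into k contiguous segments minimizing total squared error.

    Returns (cost, starts), where starts holds the start index of each segment.
    Illustrative baseline only: O(k * n^2) time, no pruning.
    """
    n = len(seq)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(seq)")

    # Prefix sums of values and squared values give any segment cost in O(1).
    s1 = [0.0] + list(accumulate(seq))
    s2 = [0.0] + list(accumulate(x * x for x in seq))

    def cost(i, j):
        # Sum of squared deviations from the mean of seq[i:j], with j > i.
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # best[m][j]: minimal cost of splitting seq[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    # back[m][j]: start index of the last segment in that optimal split.
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0

    for m in range(1, k + 1):
        for j in range(m, n + 1):
            # Try every start i of the last segment: the quadratic inner loop.
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    back[m][j] = i

    # Walk the back-pointers from the end to recover the segment starts.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    starts.reverse()
    return best[k][n], starts
```

Under these assumptions, `segment([1, 1, 1, 5, 5, 5, 9, 9], 3)` recovers the three constant runs, with segment starts `[0, 3, 6]` and zero cost. The inner loop over `i` is where the quadratic dependence on the sequence length comes from, and it is this enumeration of candidate last segments that a pruning result would aim to shorten.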

Cite

@article{arxiv.1902.03285,
  title  = {Fast Sequence Segmentation using Log-Linear Models},
  author = {Nikolaj Tatti},
  journal= {arXiv preprint arXiv:1902.03285},
  year   = {2019}
}