Optimized Algorithms to Sample Determinantal Point Processes

Computation 2018-02-26 v1 Machine Learning Machine Learning

Abstract

In this technical report, we discuss several sampling algorithms for Determinantal Point Processes (DPPs). DPPs have recently gained broad interest in the machine learning and statistics literature as random point processes with negative correlation, i.e., ones that can generate a "diverse" sample from a set of items. They are parametrized by a matrix $\mathbf{L}$, called the $L$-ensemble, that encodes the correlations between items. The standard sampling algorithm consists of three phases: (1) an eigendecomposition of $\mathbf{L}$; (2) an eigenvector sampling phase, where the eigenvectors of $\mathbf{L}$ are sampled independently via Bernoulli variables parametrized by their associated eigenvalues; (3) a Gram-Schmidt-type orthogonalisation procedure applied to the sampled eigenvectors. In a naive implementation, the computational cost of the third step is on average $\mathcal{O}(N\mu^3)$, where $\mu$ is the average number of samples of the DPP. We give an algorithm that runs in $\mathcal{O}(N\mu^2)$ and is extremely simple to implement. If memory is a constraint, we also describe a dual variant with reduced memory costs. In addition, we discuss implementation details often missing in the literature.
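To make the three phases concrete, the following is a minimal NumPy sketch of the standard (naive) sampler as it is usually presented in the DPP literature. It is not the optimized algorithm of the report. The function name `sample_dpp`, the inclusion probability $\lambda_n / (1 + \lambda_n)$ for each eigenvector, and the use of a QR factorisation for the re-orthogonalisation step are assumptions made for illustration, since the abstract does not spell them out.

```python
import numpy as np

def sample_dpp(L, rng=None):
    """Naive three-phase DPP sampler (illustrative sketch, not the optimized algorithm)."""
    rng = np.random.default_rng() if rng is None else rng

    # Phase 1: eigendecomposition of the symmetric PSD matrix L.
    eigvals, eigvecs = np.linalg.eigh(L)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off

    # Phase 2: keep eigenvector n independently with probability
    # lambda_n / (1 + lambda_n) (assumed standard L-ensemble convention).
    keep = rng.random(len(eigvals)) < eigvals / (1.0 + eigvals)
    V = eigvecs[:, keep]

    # Phase 3: draw one item per kept eigenvector, then project and
    # re-orthogonalise the remaining basis.
    sample = []
    while V.shape[1] > 0:
        # Item i is drawn with probability proportional to the squared norm of row i.
        probs = np.sum(V**2, axis=1)
        probs = probs / probs.sum()
        i = int(rng.choice(len(probs), p=probs))
        sample.append(i)

        # Use the column with the largest entry in row i as pivot and
        # eliminate row i from the remaining columns.
        j = int(np.argmax(np.abs(V[i, :])))
        pivot = V[:, j].copy()
        V = np.delete(V, j, axis=1)
        V = V - np.outer(pivot, V[i, :] / pivot[i])

        # Re-orthonormalise; this repeated step is what makes the naive
        # version cost O(N mu^3) on average.
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)

    return sorted(sample)
```

Calling `sample_dpp(L)` on a symmetric positive semi-definite `L` of size $N \times N$ returns a sorted list of distinct indices in `range(N)`; the number of returned items equals the number of eigenvectors kept in phase 2, so it varies from call to call.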

Cite

@article{arxiv.1802.08471,
  title  = {Optimized Algorithms to Sample Determinantal Point Processes},
  author = {Nicolas Tremblay and Simon Barthelme and Pierre-Olivier Amblard},
  journal= {arXiv preprint arXiv:1802.08471},
  year   = {2018}
}