Hashing-Based-Estimators for Kernel Density in High Dimensions
Abstract
Given a set of points $P \subset \mathbb{R}^d$ and a kernel $k$, the Kernel Density Estimate at a point $x \in \mathbb{R}^d$ is defined as $\mathrm{KDE}_P(x) = \frac{1}{|P|} \sum_{y \in P} k(x, y)$. We study the problem of designing a data structure that, given a data set $P$ and a kernel function, returns *approximations to the kernel density* of a query point in *sublinear time*. We introduce a class of unbiased estimators for kernel density implemented through locality-sensitive hashing, and give general theorems bounding the variance of such estimators. These estimators give rise to efficient data structures for estimating the kernel density in high dimensions for a variety of commonly used kernels. Our work is the first to provide data structures with theoretical guarantees that improve upon simple random sampling in high dimensions.
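For orientation, the kernel density at a query is the average kernel value between the query and every data point, and the simple random sampling baseline mentioned above replaces that full average with an average over a uniform sample. The sketch below illustrates only these two quantities; it is not the hashing-based estimator of the paper. The Gaussian kernel, the `bandwidth` parameter, and all function names are assumptions chosen for illustration.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-||x - y||^2 / bandwidth^2); an assumed, illustrative choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / bandwidth ** 2)


def exact_kde(points, query, kernel=gaussian_kernel):
    """Exact kernel density: average kernel value over the whole data set (linear time)."""
    return sum(kernel(query, p) for p in points) / len(points)


def sampled_kde(points, query, num_samples, kernel=gaussian_kernel, rng=random):
    """Random-sampling baseline: average over points drawn uniformly with replacement.

    Each sampled kernel value has expectation equal to the exact density,
    so the average is an unbiased estimate.
    """
    sample = [rng.choice(points) for _ in range(num_samples)]
    return sum(kernel(query, p) for p in sample) / num_samples


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n = 20, 5000
    points = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    query = [0.0] * dim
    # A wide bandwidth keeps the density well above zero in 20 dimensions.
    wide = lambda x, y: gaussian_kernel(x, y, bandwidth=5.0)
    print("exact  :", exact_kde(points, query, kernel=wide))
    print("sampled:", sampled_kde(points, query, 200, kernel=wide, rng=rng))
```

The sampled estimate touches only `num_samples` points per query. When the true density is small, the sample needs to be large to hit the few points that contribute, which is the regime where an estimator with lower variance than uniform sampling becomes attractive.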
Cite
@article{arxiv.1808.10530,
title = {Hashing-Based-Estimators for Kernel Density in High Dimensions},
author = {Moses Charikar and Paris Siminelakis},
journal = {arXiv preprint arXiv:1808.10530},
year = {2018}
}
Comments
A preliminary version of this paper appeared in FOCS 2017