Orthogonal Random Features

Machine Learning, 2016-10-31, v1

Abstract

We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for this behavior. Motivated by this discovery, we further propose Structured Orthogonal Random Features (SORF), which uses a class of structured discrete orthogonal matrices to speed up the computation. The method reduces the time cost from $\mathcal{O}(d^2)$ to $\mathcal{O}(d \log d)$, where $d$ is the data dimensionality, with almost no compromise in kernel approximation quality compared to ORF. Experiments on several datasets verify the effectiveness of ORF and SORF over the existing methods. We also provide discussions on using the same type of discrete orthogonal structure for a broader range of applications.
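The three projections contrasted in the abstract can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' reference implementation. The abstract does not spell out the "proper scaling", so the sketch assumes chi-distributed row norms that match those of a Gaussian matrix. It also assumes the structured matrix is a product of three normalized Walsh-Hadamard transforms interleaved with random sign diagonals, and that $d$ is a power of 2. All function names (`rff_matrix`, `orf_matrix`, `fwht`, `sorf_project`, `fourier_features`) are hypothetical.

```python
import numpy as np

def rff_matrix(d, sigma, rng):
    """Baseline Random Fourier Features: i.i.d. Gaussian projection matrix."""
    return rng.standard_normal((d, d)) / sigma

def orf_matrix(d, sigma, rng):
    """Random orthogonal matrix with rescaled rows.

    Assumption: row norms are drawn from a chi distribution with d degrees
    of freedom, so they match the row norms of a d x d Gaussian matrix.
    """
    Q, R = np.linalg.qr(rng.standard_normal((d, d)))
    Q = Q * np.sign(np.diag(R))               # sign fix for a uniform random rotation
    s = np.sqrt(rng.chisquare(df=d, size=d))  # chi(d)-distributed scales
    return s[:, None] * Q / sigma

def fwht(X):
    """Normalized fast Walsh-Hadamard transform of each row, O(d log d) per row.

    The number of columns must be a power of 2.
    """
    n, d = X.shape
    Y = X
    h = 1
    while h < d:
        Y = Y.reshape(n, d // (2 * h), 2, h)
        a = Y[:, :, 0, :] + Y[:, :, 1, :]
        b = Y[:, :, 0, :] - Y[:, :, 1, :]
        Y = np.stack([a, b], axis=2).reshape(n, d)
        h *= 2
    return Y / np.sqrt(d)

def sorf_project(X, sigma, rng):
    """Structured projection without forming a d x d matrix.

    Assumption: three rounds of (random sign flip, Hadamard transform),
    followed by a sqrt(d) / sigma rescaling.
    """
    d = X.shape[1]
    Y = X
    for _ in range(3):
        signs = rng.choice([-1.0, 1.0], size=d)
        Y = fwht(Y * signs)
    return Y * np.sqrt(d) / sigma

def fourier_features(P):
    """Map projections P = X W^T to features whose inner products
    approximate the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    D = P.shape[1]
    return np.hstack([np.cos(P), np.sin(P)]) / np.sqrt(D)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, sigma = 16, 4.0
    X = rng.standard_normal((50, d))
    sq_dists = ((X[:, None] - X[None]) ** 2).sum(-1)
    K = np.exp(-sq_dists / (2 * sigma**2))    # exact Gaussian kernel
    projections = {
        "RFF": X @ rff_matrix(d, sigma, rng).T,
        "ORF": X @ orf_matrix(d, sigma, rng).T,
        "SORF": sorf_project(X, sigma, rng),
    }
    for name, P in projections.items():
        Z = fourier_features(P)
        print(name, "mean squared error:", np.mean((Z @ Z.T - K) ** 2))
```

Running the script prints the mean squared kernel approximation error for each variant on one random draw. Averaging over many draws gives a more reliable comparison, since any single draw is noisy.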

Cite

@article{arxiv.1610.09072,
  title  = {Orthogonal Random Features},
  author = {Felix X. Yu and Ananda Theertha Suresh and Krzysztof Choromanski and Daniel Holtmann-Rice and Sanjiv Kumar},
  journal= {arXiv preprint arXiv:1610.09072},
  year   = {2016}
}

Comments

NIPS 2016