Related papers: Streaming Kernel PCA Algorithm With Small Space
In this paper we propose a new algorithm for streaming principal component analysis. With limited memory, small devices cannot store all the samples in the high-dimensional regime. Streaming principal component analysis aims to find the…
Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning. To allow for non-linear relations, typically…
For many modern applications in science and engineering, data are collected in a streaming fashion carrying time-varying information, and practitioners need to process them with a limited amount of memory and computational resources in a…
Oja's algorithm for Streaming Principal Component Analysis (PCA) for $n$ data-points in a $d$ dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm in $O(d)$ space and $O(nd)$ time and a…
This work provides improved guarantees for streaming principle component analysis (PCA). Given $A_1, \ldots, A_n\in \mathbb{R}^{d\times d}$ sampled independently from distributions satisfying $\mathbb{E}[A_i] = \Sigma$ for $\Sigma \succeq…
Since its inception in 1982, Oja's algorithm has become an established method for streaming principle component analysis (PCA). We study the problem of streaming PCA, where the data-points are sampled from an irreducible, aperiodic, and…
Low-precision streaming PCA estimates the top principal component in a streaming setting under limited precision. We establish an information-theoretic lower bound on the quantization resolution required to achieve a target accuracy for the…
We study streaming principal component analysis (PCA), that is to find, in $O(dk)$ space, the top $k$ eigenvectors of a $d\times d$ hidden matrix $\bf \Sigma$ with online vectors drawn from covariance matrix $\bf \Sigma$. We provide…
We consider streaming, one-pass principal component analysis (PCA), in the high-dimensional regime, with limited memory. Here, $p$-dimensional samples are presented sequentially, and the goal is to produce the $k$-dimensional subspace that…
Fair Principal Component Analysis (PCA) is a problem setting where we aim to perform PCA while making the resulting representation fair in that the projected distributions, conditional on the sensitive attributes, match one another.…
We study the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix $A \in R^{m \times n},$ a rank parameter $k < rank(A)$, and an accuracy parameter $0 < \epsilon < 1$, we want to…
We study the statistical and computational aspects of kernel principal component analysis using random Fourier features and show that under mild assumptions, $O(\sqrt{n} \log n)$ features suffices to achieve $O(1/\epsilon^2)$ sample…
In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to routinely perform tasks like principal component analysis (PCA). Recursive algorithms that update the PCA with…
Oja's algorithm has been the cornerstone of streaming methods in Principal Component Analysis (PCA) since it was first proposed in 1982. However, Oja's algorithm does not have a standardized choice of learning rate (step size) that both…
Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is the fact that the projection of a dataset onto its first…
Principal Component Analysis (PCA) is the workhorse tool for dimensionality reduction in this era of big data. While often overlooked, the purpose of PCA is not only to reduce data dimensionality, but also to yield features that are…
Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further…
In this paper we analyze the behavior of the Oja's algorithm for online/streaming principal component subspace estimation. It is proved that with high probability it performs an efficient, gap-free, global convergence rate to approximate an…
We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja's algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector. Most existing…
We consider streaming principal component analysis when the stochastic data-generating model is subject to perturbations. While existing models assume a fixed covariance, we adopt a robust perspective where the covariance matrix belongs to…