Related papers: Streaming Kernel PCA Algorithm With Small Space

200 papers

In this paper we propose a new algorithm for streaming principal component analysis. With limited memory, small devices cannot store all the samples in the high-dimensional regime. Streaming principal component analysis aims to find the…

Machine Learning · Statistics 2018-02-16 Puyudi Yang, Cho-Jui Hsieh, Jane-Ling Wang

Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning. To allow for non-linear relations, typically…

Data Structures and Algorithms · Computer Science 2015-12-17 Mina Ghashami, Daniel Perry, Jeff M. Phillips

For many modern applications in science and engineering, data are collected in a streaming fashion carrying time-varying information, and practitioners need to process them with a limited amount of memory and computational resources in a…

Machine Learning · Statistics 2018-06-13 Laura Balzano, Yuejie Chi, Yue M. Lu

Oja's algorithm for Streaming Principal Component Analysis (PCA) for $n$ data-points in a $d$ dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm in $O(d)$ space and $O(nd)$ time and a…

Statistics Theory · Mathematics 2025-03-12 Syamantak Kumar, Purnamrita Sarkar

This work provides improved guarantees for streaming principal component analysis (PCA). Given $A_1, \ldots, A_n\in \mathbb{R}^{d\times d}$ sampled independently from distributions satisfying $\mathbb{E}[A_i] = \Sigma$ for $\Sigma \succeq…

Machine Learning · Computer Science 2016-03-29 Prateek Jain, Chi Jin, Sham M. Kakade, Praneeth Netrapalli, Aaron Sidford

Since its inception in 1982, Oja's algorithm has become an established method for streaming principal component analysis (PCA). We study the problem of streaming PCA, where the data-points are sampled from an irreducible, aperiodic, and…

Statistics Theory · Mathematics 2023-06-21 Syamantak Kumar, Purnamrita Sarkar

Low-precision streaming PCA estimates the top principal component in a streaming setting under limited precision. We establish an information-theoretic lower bound on the quantization resolution required to achieve a target accuracy for the…

Machine Learning · Computer Science 2025-10-28 Sanjoy Dasgupta, Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar

We study streaming principal component analysis (PCA), that is to find, in $O(dk)$ space, the top $k$ eigenvectors of a $d\times d$ hidden matrix $\bf \Sigma$ with online vectors drawn from covariance matrix $\bf \Sigma$. We provide…

Optimization and Control · Mathematics 2017-04-18 Zeyuan Allen-Zhu, Yuanzhi Li

We consider streaming, one-pass principal component analysis (PCA), in the high-dimensional regime, with limited memory. Here, $p$-dimensional samples are presented sequentially, and the goal is to produce the $k$-dimensional subspace that…

Machine Learning · Statistics 2013-07-02 Ioannis Mitliagkas, Constantine Caramanis, Prateek Jain

Fair Principal Component Analysis (PCA) is a problem setting where we aim to perform PCA while making the resulting representation fair in that the projected distributions, conditional on the sensitive attributes, match one another.…

Machine Learning · Statistics 2023-10-31 Junghyun Lee, Hanseul Cho, Se-Young Yun, Chulhee Yun

We study the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix $A \in \mathbb{R}^{m \times n}$, a rank parameter $k < \mathrm{rank}(A)$, and an accuracy parameter $0 < \epsilon < 1$, we want to…

Data Structures and Algorithms · Computer Science 2016-07-13 Christos Boutsidis, David P. Woodruff, Peilin Zhong

We study the statistical and computational aspects of kernel principal component analysis using random Fourier features and show that under mild assumptions, $O(\sqrt{n} \log n)$ features suffice to achieve $O(1/\epsilon^2)$ sample…

Machine Learning · Computer Science 2018-11-19 Enayat Ullah, Poorya Mianjy, Teodor V. Marinov, Raman Arora

In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to routinely perform tasks like principal component analysis (PCA). Recursive algorithms that update the PCA with…

Machine Learning · Statistics 2015-11-13 Hervé Cardot, David Degras

Oja's algorithm has been the cornerstone of streaming methods in Principal Component Analysis (PCA) since it was first proposed in 1982. However, Oja's algorithm does not have a standardized choice of learning rate (step size) that both…

Machine Learning · Statistics 2019-11-04 Amelia Henriksen, Rachel Ward

Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is the fact that the projection of a dataset onto its first…

Machine Learning · Statistics 2017-05-19 Xianghui Luo, Robert J. Durrant

Principal Component Analysis (PCA) is the workhorse tool for dimensionality reduction in this era of big data. While often overlooked, the purpose of PCA is not only to reduce data dimensionality, but also to yield features that are…

Machine Learning · Computer Science 2021-11-30 Arpita Gang, Waheed U. Bajwa

Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further…

Machine Learning · Statistics 2019-05-20 Davoud Ataee Tarzanagh, Mohamad Kazem Shirani Faradonbeh, George Michailidis

In this paper we analyze the behavior of Oja's algorithm for online/streaming principal component subspace estimation. It is proved that with high probability it performs an efficient, gap-free, global convergence rate to approximate an…

Machine Learning · Computer Science 2024-03-06 Xin Liang

We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja's algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector. Most existing…

Statistics Theory · Mathematics 2025-07-22 Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar

We consider streaming principal component analysis when the stochastic data-generating model is subject to perturbations. While existing models assume a fixed covariance, we adopt a robust perspective where the covariance matrix belongs to…

Machine Learning · Statistics 2022-10-13 Daniel Bienstock, Minchan Jeong, Apurv Shukla, Se-Young Yun