Related papers: Memory Limited, Streaming PCA
In this paper we propose a new algorithm for streaming principal component analysis. With limited memory, small devices cannot store all the samples in the high-dimensional regime. Streaming principal component analysis aims to find the…
Principal Component Analysis (PCA) is a widely used technique in machine learning, data analysis and signal processing. With the increase in the size and complexity of datasets, it has become important to develop low-space usage algorithms…
Principal component analysis (PCA) is a fundamental dimension reduction tool in statistics and machine learning. For large and high-dimensional data, computing the PCA (i.e., the singular vectors corresponding to a number of dominant…
Principal component analysis (PCA) is a classical method for dimensionality reduction based on extracting the dominant eigenvectors of the sample covariance matrix. However, PCA is well known to behave poorly in the ``large $p$, small $n$''…
Principal component analysis (PCA) aims at estimating the direction of maximal variability of a high-dimensional dataset. A natural question is: does this task become easier, and estimation more accurate, when we exploit additional…
In this paper, we propose an acceleration scheme for online memory-limited PCA methods. Our scheme converges to the first $k>1$ eigenvectors in a single data pass. We provide empirical convergence results of our scheme based on the spiked…
Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However to properly work with high-dimensional data, PCA poses severe mathematical…
Fair Principal Component Analysis (PCA) is a problem setting where we aim to perform PCA while making the resulting representation fair in that the projected distributions, conditional on the sensitive attributes, match one another.…
In the current context of data explosion, online techniques that do not require storing all data in memory are indispensable to routinely perform tasks like principal component analysis (PCA). Recursive algorithms that update the PCA with…
Oja's algorithm for Streaming Principal Component Analysis (PCA) for $n$ data-points in a $d$ dimensional space achieves the same sin-squared error $O(r_{\mathsf{eff}}/n)$ as the offline algorithm in $O(d)$ space and $O(nd)$ time and a…
We study the problem of recovering the subspace spanned by the first $k$ principal components of $d$-dimensional data under the streaming setting, with a memory bound of $O(kd)$. Two families of algorithms are known for this problem. The…
Principal Component Analysis (PCA) has been widely used for dimensionality reduction and feature extraction. Robust PCA (RPCA), under different robust distance metrics, such as l1-norm and l2, p-norm, can deal with noise or outliers to some…
Sparse Principal Component Analysis (PCA) methods are efficient tools to reduce the dimension (or the number of variables) of complex data. Sparse principal components (PCs) are easier to interpret than conventional PCs, because most…
Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further…
Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of…
Principal component analysis (PCA) is one of the most powerful tools in machine learning. The simplest method for PCA, the power iteration, requires $\mathcal O(1/\Delta)$ full-data passes to recover the principal component of a matrix with…
Mining useful clusters from high dimensional data has received significant attention of the computer vision and pattern recognition community in the recent years. Linear and non-linear dimensionality reduction has played an important role…
For many modern applications in science and engineering, data are collected in a streaming fashion carrying time-varying information, and practitioners need to process them with a limited amount of memory and computational resources in a…
Principal component analysis (PCA) is a widely used unsupervised dimensionality reduction technique in machine learning, applied across various fields such as bioinformatics, computer vision and finance. However, when the response variables…
We present a federated, asynchronous, and $(\varepsilon, \delta)$-differentially private algorithm for PCA in the memory-limited setting. Our algorithm incrementally computes local model updates using a streaming procedure and adaptively…