Related papers: Improved Distributed Principal Component Analysis
Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning. While PCA is often thought of as a dimensionality reduction method, the purpose of PCA is actually two-fold: dimension reduction…
Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further…
Kernel Principal Component Analysis (KPCA) is a key machine learning algorithm for extracting nonlinear features from data. In the presence of a large volume of high dimensional data collected in a distributed fashion, it becomes very…
Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication…
Principal component analysis (PCA) is not only a fundamental dimension reduction method, but is also a widely used network anomaly detection technique. Traditionally, PCA is performed in a centralized manner, which has poor scalability for…
Principal Subspace Analysis (PSA) -- and its sibling, Principal Component Analysis (PCA) -- is one of the most popular approaches for dimensionality reduction in signal processing and machine learning. But centralized PSA/PCA solutions are…
The Principal Component Analysis (PCA) is a data dimensionality reduction technique well-suited for processing data from sensor networks. It can be applied to tasks like compression, event detection, and event recognition. This technique is…
Due to the rapid growth of smart agents such as weakly connected computational nodes and sensors, developing decentralized algorithms that can perform computations on local agents becomes a major research direction. This paper considers the…
Distributed algorithms and theories are called for in this era of big data. Under weaker local signal-to-noise ratios, we improve upon the celebrated one-round distributed principal component analysis (PCA) algorithm designed in the spirit…
Principal Component Analysis (PCA) is the workhorse tool for dimensionality reduction in this era of big data. While often overlooked, the purpose of PCA is not only to reduce data dimensionality, but also to yield features that are…
Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However to properly work with high-dimensional data, PCA poses severe mathematical…
Recently years, the attempts on distilling mobile data into useful knowledge has been led to the deployment of machine learning algorithms at the network edge. Principal component analysis (PCA) is a classic technique for extracting the…
Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning including feature construction, subspace embedding, and outlier detection. In this paper, we present an algorithm for computing the top…
Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is the fact that the projection of a dataset onto its first…
We study the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix $A \in R^{m \times n},$ a rank parameter $k < rank(A)$, and an accuracy parameter $0 < \epsilon < 1$, we want to…
Learning augmented is a machine learning concept built to improve the performance of a method or model, such as enhancing its ability to predict and generalize data or features, or testing the reliability of the method by introducing noise…
Distributed computing is a standard way to scale up machine learning and data science algorithms to process large amounts of data. In such settings, avoiding communication amongst machines is paramount for achieving high performance. Rather…
We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of $m$ stores a sample of $n$ points sampled i.i.d. from a single unknown distribution. We study algorithms for…
The growing size of modern data sets brings many challenges to the existing statistical estimation approaches, which calls for new distributed methodologies. This paper studies distributed estimation for a fundamental statistical machine…
Principal component analysis (PCA) is one of the most popular dimension reduction techniques in statistics and is especially powerful when a multivariate distribution is concentrated near a lower-dimensional subspace. Multivariate extreme…