Related papers: Improved Distributed Principal Component Analysis

200 papers

Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning. While PCA is often thought of as a dimensionality reduction method, the purpose of PCA is actually two-fold: dimension reduction…

Machine Learning · Computer Science 2023-01-25 Arpita Gang, Waheed U. Bajwa

Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further…

Machine Learning · Statistics 2019-05-20 Davoud Ataee Tarzanagh, Mohamad Kazem Shirani Faradonbeh, George Michailidis

Kernel Principal Component Analysis (KPCA) is a key machine learning algorithm for extracting nonlinear features from data. In the presence of a large volume of high dimensional data collected in a distributed fashion, it becomes very…

Machine Learning · Computer Science 2016-02-16 Maria-Florina Balcan, Yingyu Liang, Le Song, David Woodruff, Bo Xie

Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication…

Computation · Statistics 2018-01-11 Jianqing Fan, Dong Wang, Kaizheng Wang, Ziwei Zhu

Principal component analysis (PCA) is not only a fundamental dimension reduction method, but is also a widely used network anomaly detection technique. Traditionally, PCA is performed in a centralized manner, which has poor scalability for…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-12-22 Ni An, Steven Weber

Principal Subspace Analysis (PSA) -- and its sibling, Principal Component Analysis (PCA) -- is one of the most popular approaches for dimensionality reduction in signal processing and machine learning. But centralized PSA/PCA solutions are…

Machine Learning · Computer Science 2021-11-25 Arpita Gang, Bingqing Xiang, Waheed U. Bajwa

The Principal Component Analysis (PCA) is a data dimensionality reduction technique well-suited for processing data from sensor networks. It can be applied to tasks like compression, event detection, and event recognition. This technique is…

Networking and Internet Architecture · Computer Science 2010-03-13 Yann-Aël Le Borgne, Sylvain Raybaud, Gianluca Bontempi

Due to the rapid growth of smart agents such as weakly connected computational nodes and sensors, developing decentralized algorithms that can perform computations on local agents becomes a major research direction. This paper considers the…

Machine Learning · Computer Science 2021-02-09 Haishan Ye, Tong Zhang

Distributed algorithms and theories are called for in this era of big data. Under weaker local signal-to-noise ratios, we improve upon the celebrated one-round distributed principal component analysis (PCA) algorithm designed in the spirit…

Methodology · Statistics 2025-07-01 ZeYu Li, Xinsheng Zhang, Wang Zhou

Principal Component Analysis (PCA) is the workhorse tool for dimensionality reduction in this era of big data. While often overlooked, the purpose of PCA is not only to reduce data dimensionality, but also to yield features that are…

Machine Learning · Computer Science 2021-11-30 Arpita Gang, Waheed U. Bajwa

Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However, to properly work with high-dimensional data, PCA poses severe mathematical…

Quantitative Methods · Quantitative Biology 2018-10-18 Luigi Leonardo Palese

In recent years, attempts to distill mobile data into useful knowledge have led to the deployment of machine learning algorithms at the network edge. Principal component analysis (PCA) is a classic technique for extracting the…

Information Theory · Computer Science 2022-04-04 Zezhong Zhang, Guangxu Zhu, Rui Wang, Vincent K. N. Lau, Kaibin Huang

Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning including feature construction, subspace embedding, and outlier detection. In this paper, we present an algorithm for computing the top…

Machine Learning · Computer Science 2013-10-25 Nikos Karampatziakis, Paul Mineiro

Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is the fact that the projection of a dataset onto its first…

Machine Learning · Statistics 2017-05-19 Xianghui Luo, Robert J. Durrant

We study the Principal Component Analysis (PCA) problem in the distributed and streaming models of computation. Given a matrix $A \in \mathbb{R}^{m \times n}$, a rank parameter $k < \mathrm{rank}(A)$, and an accuracy parameter $0 < \epsilon < 1$, we want to…

Data Structures and Algorithms · Computer Science 2016-07-13 Christos Boutsidis, David P. Woodruff, Peilin Zhong

Learning augmented is a machine learning concept built to improve the performance of a method or model, such as enhancing its ability to predict and generalize data or features, or testing the reliability of the method by introducing noise…

Machine Learning · Computer Science 2024-01-09 Issam K. O Jabari, Shofiyah, Pradiptya Kahvi S, Novi Nur Putriwijaya, Novanto Yudistira

Distributed computing is a standard way to scale up machine learning and data science algorithms to process large amounts of data. In such settings, avoiding communication amongst machines is paramount for achieving high performance. Rather…

Machine Learning · Statistics 2021-05-04 Vasileios Charisopoulos, Austin R. Benson, Anil Damle

We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of $m$ stores a sample of $n$ points sampled i.i.d. from a single unknown distribution. We study algorithms for…

Machine Learning · Computer Science 2017-02-28 Dan Garber, Ohad Shamir, Nathan Srebro

The growing size of modern data sets brings many challenges to the existing statistical estimation approaches, which calls for new distributed methodologies. This paper studies distributed estimation for a fundamental statistical machine…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-02-04 Xi Chen, Jason D. Lee, He Li, Yun Yang

Principal component analysis (PCA) is one of the most popular dimension reduction techniques in statistics and is especially powerful when a multivariate distribution is concentrated near a lower-dimensional subspace. Multivariate extreme…

Methodology · Statistics 2025-07-15 Felix Reinbott, Anja Janßen