Related papers: Communication Efficient Distributed Kernel Princip…

Improved Distributed Principal Component Analysis

We study the distributed computing setting in which there are multiple servers, each holding a set of points, who wish to compute functions on the union of their point sets. A key task in this setting is Principal Component Analysis (PCA),…

Machine Learning · Computer Science 2014-12-24 Maria-Florina Balcan, Vandana Kanchanapally, Yingyu Liang, David Woodruff

FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis

Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning. While PCA is often thought of as a dimensionality reduction method, the purpose of PCA is actually two-fold: dimension reduction…

Machine Learning · Computer Science 2023-01-25 Arpita Gang, Waheed U. Bajwa

Learnable Faster Kernel-PCA for Nonlinear Fault Detection: Deep Autoencoder-Based Realization

Kernel principal component analysis (KPCA) is a well-recognized nonlinear dimensionality reduction method that has been widely used in nonlinear fault detection tasks. As a kernel trick-based method, KPCA inherits two major problems. First,…

Machine Learning · Computer Science 2024-10-28 Zelin Ren, Xuebing Yang, Yuchen Jiang, Wensheng Zhang

Deep Kernel Principal Component Analysis for Multi-level Feature Learning

Principal Component Analysis (PCA) and its nonlinear extension Kernel PCA (KPCA) are widely used across science and industry for data analysis and dimensionality reduction. Modern deep learning tools have achieved great empirical success,…

Machine Learning · Computer Science 2023-02-23 Francesco Tonin, Qinghua Tao, Panagiotis Patrinos, Johan A. K. Suykens

Upper and Lower Bounds on the Performance of Kernel PCA

Principal Component Analysis (PCA) is a popular method for dimension reduction and has attracted an unfailing interest for decades. More recently, kernel PCA (KPCA) has emerged as an extension of PCA but, despite its use in practice, a…

Machine Learning · Computer Science 2023-01-25 Maxime Haddouche, Benjamin Guedj, John Shawe-Taylor

A kernel Principal Component Analysis (kPCA) digest with a new backward mapping (pre-image reconstruction) strategy

Methodologies for multidimensionality reduction aim at discovering low-dimensional manifolds where data ranges. Principal Component Analysis (PCA) is very effective if data have linear structure, but fails in identifying a possible…

Numerical Analysis · Mathematics 2021-01-14 Alberto García-González, Antonio Huerta, Sergio Zlotnik, Pedro Díez

On the performance overhead tradeoff of distributed principal component analysis via data partitioning

Principal component analysis (PCA) is not only a fundamental dimension reduction method, but is also a widely used network anomaly detection technique. Traditionally, PCA is performed in a centralized manner, which has poor scalability for…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-12-22 Ni An, Steven Weber

Distributed Principal Component Analysis for Wireless Sensor Networks

Principal Component Analysis (PCA) is a data dimensionality reduction technique well-suited for processing data from sensor networks. It can be applied to tasks like compression, event detection, and event recognition. This technique is…

Networking and Internet Architecture · Computer Science 2010-03-13 Yann-Aël Le Borgne, Sylvain Raybaud, Gianluca Bontempi

Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis

We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of $m$ stores a sample of $n$ points sampled i.i.d. from a single unknown distribution. We study algorithms for…

Machine Learning · Computer Science 2017-02-28 Dan Garber, Ohad Shamir, Nathan Srebro

Distributed Estimation of Principal Eigenspaces

Principal component analysis (PCA) is fundamental to statistical machine learning. It extracts latent principal factors that contribute to the most variation of the data. When data are stored across multiple machines, however, communication…

Computation · Statistics 2018-01-11 Jianqing Fan, Dong Wang, Kaizheng Wang, Ziwei Zhu

Streaming Kernel Principal Component Analysis

Kernel principal component analysis (KPCA) provides a concise set of basis vectors which capture non-linear structures within large data sets, and is a central tool in data analysis and learning. To allow for non-linear relations, typically…

Data Structures and Algorithms · Computer Science 2015-12-17 Mina Ghashami, Daniel Perry, Jeff M. Phillips

Combining Structured and Unstructured Randomness in Large Scale PCA

Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning including feature construction, subspace embedding, and outlier detection. In this paper, we present an algorithm for computing the top…

Machine Learning · Computer Science 2013-10-25 Nikos Karampatziakis, Paul Mineiro

A Linearly Convergent Algorithm for Distributed Principal Component Analysis

Principal Component Analysis (PCA) is the workhorse tool for dimensionality reduction in this era of big data. While often overlooked, the purpose of PCA is not only to reduce data dimensionality, but also to yield features that are…

Machine Learning · Computer Science 2021-11-30 Arpita Gang, Waheed U. Bajwa

DeEPCA: Decentralized Exact PCA with Linear Convergence Rate

Due to the rapid growth of smart agents such as weakly connected computational nodes and sensors, developing decentralized algorithms that can perform computations on local agents becomes a major research direction. This paper considers the…

Machine Learning · Computer Science 2021-02-09 Haishan Ye, Tong Zhang

Improvement of variables interpretability in kernel PCA

Kernel methods have been proven to be a powerful tool for the integration and analysis of high-throughput technologies generated data. Kernels offer a nonlinear version of any linear algorithm solely based on dot products. The kernelized…

Applications · Statistics 2024-11-27 Mitja Briscik, Marie-Agnès Dillies, Sébastien Déjean

Scale Up Nonlinear Component Analysis with Doubly Stochastic Gradients

Nonlinear component analysis methods such as kernel Principal Component Analysis (KPCA) and kernel Canonical Correlation Analysis (KCCA) are widely used in machine learning, statistics and data analysis, but they cannot scale up to big datasets.…

Machine Learning · Computer Science 2016-01-12 Bo Xie, Yingyu Liang, Le Song

Empirical Evaluation of Kernel PCA Approximation Methods in Classification Tasks

Kernel Principal Component Analysis (KPCA) is a popular dimensionality reduction technique with a wide range of applications. However, it suffers from the problem of poor scalability. Various approximation methods have been proposed in the…

Machine Learning · Computer Science 2017-12-13 Deena P. Francis, Kumudha Raimond

A random version of principal component analysis in data clustering

Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However, to work properly with high-dimensional data, PCA poses severe mathematical…

Quantitative Methods · Quantitative Biology 2018-10-18 Luigi Leonardo Palese

Online Distributed Estimation of Principal Eigenspaces

Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further…

Machine Learning · Statistics 2019-05-20 Davoud Ataee Tarzanagh, Mohamad Kazem Shirani Faradonbeh, George Michailidis

Approximate Kernel PCA Using Random Features: Computational vs. Statistical Trade-off

Kernel methods are powerful learning methodologies that enable non-linear data analysis. Despite their popularity, they suffer from poor scalability in big data scenarios. Various approximation methods, including random feature…

Machine Learning · Statistics 2022-06-14 Bharath Sriperumbudur, Nicholas Sterge