Related papers: Fast Randomized PCA for Sparse Data

Single-Pass PCA of Large High-Dimensional Data

Principal component analysis (PCA) is a fundamental dimension reduction tool in statistics and machine learning. For large and high-dimensional data, computing the PCA (i.e., the singular vectors corresponding to a number of dominant…

Data Structures and Algorithms · Computer Science 2017-04-26 Wenjian Yu, Yu Gu, Jian Li, Shenghua Liu, Yaohang Li

An algorithm for the principal component analysis of large data sets

Recently popularized randomized methods for principal component analysis (PCA) efficiently and reliably produce nearly optimal accuracy, even on parallel processors, unlike the classical (deterministic) alternatives. We adapt one of…

Computation · Statistics 2011-12-23 Nathan Halko, Per-Gunnar Martinsson, Yoel Shkolnisky, Mark Tygert

FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis

Principal Component Analysis (PCA) is a fundamental data preprocessing tool in the world of machine learning. While PCA is often thought of as a dimensionality reduction method, the purpose of PCA is actually two-fold: dimension reduction…

Machine Learning · Computer Science 2023-01-25 Arpita Gang, Waheed U. Bajwa

Large-Scale Paralleled Sparse Principal Component Analysis

Principal component analysis (PCA) is a statistical technique commonly used in multivariate data analysis. However, PCA can be difficult to interpret and explain since the principal components (PCs) are linear combinations of the original…

Mathematical Software · Computer Science 2013-12-24 W. Liu, H. Zhang, D. Tao, Y. Wang, K. Lu

Large-Scale Sparse Principal Component Analysis with Application to Text Data

Sparse PCA provides a linear combination of a small number of features that maximizes variance across data. Although Sparse PCA has apparent advantages compared to PCA, such as better interpretability, it is generally thought to be…

Machine Learning · Statistics 2012-10-29 Youwei Zhang, Laurent El Ghaoui

Sparse Principal Component Analysis via Variable Projection

Sparse principal component analysis (SPCA) has emerged as a powerful technique for modern data analysis, providing improved interpretation of low-rank structures by identifying localized spatial structures in the data and disambiguating…

Machine Learning · Statistics 2020-12-29 N. Benjamin Erichson, Peng Zheng, Krithika Manohar, Steven L. Brunton, J. Nathan Kutz, Aleksandr Y. Aravkin

Sparse Generalized Principal Component Analysis for Large-scale Applications beyond Gaussianity

Principal Component Analysis (PCA) is a dimension reduction technique. It produces inconsistent estimators when the dimensionality is moderate to high, which is often the problem in modern large-scale applications where algorithm…

Computation · Statistics 2016-01-29 Qiaoya Zhang, Yiyuan She

A Convex Sparse PCA for Feature Analysis

Principal component analysis (PCA) has been widely applied to dimensionality reduction and data pre-processing for different applications in engineering, biology and social science. Classical PCA and its variants seek linear projections…

Machine Learning · Computer Science 2017-07-11 Xiaojun Chang, Feiping Nie, Yi Yang, Heng Huang

Sparse PCA With Multiple Components

Sparse Principal Component Analysis (sPCA) is a cardinal technique for obtaining combinations of features, or principal components (PCs), that explain the variance of high-dimensional datasets in an interpretable manner. This involves…

Optimization and Control · Mathematics 2025-12-02 Ryan Cory-Wright, Jean Pauphilet

Approximation Algorithms for Sparse Principal Component Analysis

Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and multivariate statistics. To improve the interpretability of PCA, various approaches to obtain sparse principal direction loadings have…

Data Structures and Algorithms · Computer Science 2021-06-07 Agniva Chowdhury, Petros Drineas, David P. Woodruff, Samson Zhou

Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality

Sparse principal component analysis (PCA) is a popular dimensionality reduction technique for obtaining principal components which are linear combinations of a small subset of the original features. Existing approaches cannot supply…

Optimization and Control · Mathematics 2022-02-22 Dimitris Bertsimas, Ryan Cory-Wright, Jean Pauphilet

Sparse Principal Components Analysis

Principal components analysis (PCA) is a classical method for the reduction of dimensionality of data in the form of n observations (or cases) of a vector with p variables. For a simple model of factor analysis type, it is proved that…

Statistics Theory · Mathematics 2009-01-29 Iain M Johnstone, Arthur Yu Lu

Generalized power method for sparse principal component analysis

In this paper we develop a new approach to sparse principal component analysis (sparse PCA). We propose two single-unit and two block optimization formulations of the sparse PCA problem, aimed at extracting a single sparse dominant…

Optimization and Control · Mathematics 2008-12-01 Michel Journée, Yurii Nesterov, Peter Richtárik, Rodolphe Sepulchre

Gradient-based Sparse Principal Component Analysis with Extensions to Online Learning

Sparse principal component analysis (PCA) is an important technique for dimensionality reduction of high-dimensional data. However, most existing sparse PCA algorithms are based on non-convex optimization, which provide little guarantee on…

Methodology · Statistics 2019-11-20 Yixuan Qiu, Jing Lei, Kathryn Roeder

A randomized algorithm for principal component analysis

Principal component analysis (PCA) requires the computation of a low-rank approximation to a matrix containing the data being analyzed. In many applications of PCA, the best possible accuracy of any rank-deficient approximation is at most a…

Computation · Statistics 2010-06-04 Vladimir Rokhlin, Arthur Szlam, Mark Tygert

A random version of principal component analysis in data clustering

Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However, to work properly with high-dimensional data, PCA poses severe mathematical…

Quantitative Methods · Quantitative Biology 2018-10-18 Luigi Leonardo Palese

Online Distributed Estimation of Principal Eigenspaces

Principal components analysis (PCA) is a widely used dimension reduction technique with an extensive range of applications. In this paper, an online distributed algorithm is proposed for recovering the principal eigenspaces. We further…

Machine Learning · Statistics 2019-05-20 Davoud Ataee Tarzanagh, Mohamad Kazem Shirani Faradonbeh, George Michailidis

Combining Structured and Unstructured Randomness in Large Scale PCA

Principal Component Analysis (PCA) is a ubiquitous tool with many applications in machine learning including feature construction, subspace embedding, and outlier detection. In this paper, we present an algorithm for computing the top…

Machine Learning · Computer Science 2013-10-25 Nikos Karampatziakis, Paul Mineiro

Clustering and Feature Selection using Sparse Principal Component Analysis

In this paper, we study the application of sparse principal component analysis (PCA) to clustering and feature selection problems. Sparse PCA seeks sparse factors, or linear combinations of the data variables, explaining a maximum amount of…

Artificial Intelligence · Computer Science 2008-10-08 Ronny Luss, Alexandre d'Aspremont

PCA-RAG: Principal Component Analysis for Efficient Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm for grounding large language models in external knowledge sources, improving the precision of agents' responses. However, high-dimensional language model embeddings,…

Machine Learning · Computer Science 2025-04-14 Arman Khaledian, Amirreza Ghadiridehkordi, Nariman Khaledian