Related papers: Finite sample approximation results for principal …
Principal component analysis (PCA) aims at estimating the direction of maximal variability of a high-dimensional dataset. A natural question is: does this task become easier, and estimation more accurate, when we exploit additional…
Principal components analysis (PCA) is a classical method for the reduction of dimensionality of data in the form of n observations (or cases) of a vector with p variables. For a simple model of factor analysis type, it is proved that…
Principal component analysis (PCA) is a classical method for dimensionality reduction based on extracting the dominant eigenvectors of the sample covariance matrix. However, PCA is well known to behave poorly in the ``large $p$, small $n$''…
Principal component analysis (PCA) is a most frequently used statistical tool in almost all branches of data science. However, like many other statistical tools, there is sometimes the risk of misuse or even abuse. In this paper, we…
Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of…
Principal component analysis is an important pattern recognition and dimensionality reduction tool in many applications. Principal components are computed as eigenvectors of a maximum likelihood covariance $\widehat{\Sigma}$ that…
Principal Component Analysis (PCA) is a well known procedure to reduce intrinsic complexity of a dataset, essentially through simplifying the covariance structure or the correlation structure. We introduce a novel algebraic, model-based…
With the development of high-throughput technologies, principal component analysis (PCA) in the high-dimensional regime is of great interest. Most of the existing theoretical and methodological results for high-dimensional PCA are based on…
Principal Component Analysis (PCA) is one of the most commonly used statistical methods for data exploration, and for dimensionality reduction wherein the first few principal components account for an appreciable proportion of the…
Principal component analysis (PCA) is a widespread technique for data analysis that relies on the covariance-correlation matrix of the analyzed data. However to properly work with high-dimensional data, PCA poses severe mathematical…
Principal component analysis (PCA) is a widely used dimension reduction technique in machine learning and multivariate statistics. To improve the interpretability of PCA, various approaches to obtain sparse principal direction loadings have…
In this paper we analyze approximate methods for undertaking a principal components analysis (PCA) on large data sets. PCA is a classical dimension reduction method that involves the projection of the data onto the subspace spanned by the…
A general asymptotic framework is developed for studying consis- tency properties of principal component analysis (PCA). Our frame- work includes several previously studied domains of asymptotics as special cases and allows one to…
A system with many degrees of freedom can be characterized by a covariance matrix; principal components analysis (PCA) focuses on the eigenvalues of this matrix, hoping to find a lower dimensional description. But when the spectrum is…
Principal component analysis (PCA) is a popular dimension reduction technique often used to visualize high-dimensional data structures. In genomics, this can involve millions of variables, but only tens to hundreds of observations.…
We study attention mechanisms through the lens of a canonical unsupervised problem: principal component analysis (PCA). We show that, when trained on Gaussian data, both softmax and linear attention layers learn parameters that align with…
Principal component analysis (PCA) is a dimensionality reduction method in data analysis that involves diagonalizing the covariance matrix of the dataset. Recently, quantum algorithms have been formulated for PCA based on diagonalizing a…
The study of stability and sensitivity of statistical methods or algorithms with respect to their data is an important problem in machine learning and statistics. The performance of the algorithm under resampling of the data is a…
A number of settings arise in which it is of interest to predict Principal Component (PC) scores for new observations using data from an initial sample. In this paper, we demonstrate that naive approaches to PC score prediction can be…
Principal Component Analysis (PCA) is the workhorse tool for dimensionality reduction in this era of big data. While often overlooked, the purpose of PCA is not only to reduce data dimensionality, but also to yield features that are…