Related papers: A Bayesian factor analysis model for high-dimensio…
Principal component analysis (PCA) is very popular to perform dimension reduction. The selection of the number of significant components is essential but often based on some practical heuristics depending on the application. Only few works…
Big data is transforming our world, revolutionizing operations and analytics everywhere, from financial engineering to biomedical sciences. The complexity of big data often makes dimension reduction techniques necessary before conducting…
Factors models are routinely used to analyze high-dimensional data in both single-study and multi-study settings. Bayesian inference for such models relies on Markov Chain Monte Carlo (MCMC) methods which scale poorly as the number of…
Principal components analysis (PCA) is a well-known technique for approximating a tabular data set by a low rank matrix. Here, we extend the idea of PCA to handle arbitrary data sets consisting of numerical, Boolean, categorical, ordinal,…
We present a Bayesian model selection approach to estimate the intrinsic dimensionality of a high-dimensional dataset. To this end, we introduce a novel formulation of the probabilisitic principal component analysis model based on a…
Principal component analysis (PCA) is perhaps the most widely used method for data dimensionality reduction. A key question in PCA is deciding how many factors to retain. This manuscript describes a new approach to automatically selecting…
Principal component analysis (PCA) is arguably the most widely used approach for large-dimensional factor analysis. While it is effective when the factors are sufficiently strong, it can be inconsistent when the factors are weak and/or the…
Principal component analysis (PCA) is often used to analyze multivariate data together with cluster analysis, which depends on the number of principal components used. It is therefore important to determine the number of significant…
This is a tutorial and survey paper on factor analysis, probabilistic Principal Component Analysis (PCA), variational inference, and Variational Autoencoder (VAE). These methods, which are tightly related, are dimensionality reduction and…
Gaussian factor models have proven widely useful for parsimoniously characterizing dependence in multivariate data. There is a rich literature on their extension to mixed categorical and continuous variables, using latent Gaussian variables…
Sparse versions of principal component analysis (PCA) have imposed themselves as simple, yet powerful ways of selecting relevant features of high-dimensional data in an unsupervised manner. However, when several sparse principal components…
We propose a multiple imputation method based on principal component analysis (PCA) to deal with incomplete continuous data. To reflect the uncertainty of the parameters from one imputation to the next, we use a Bayesian treatment of the…
Principal components analysis (PCA) is a classical method for the reduction of dimensionality of data in the form of n observations (or cases) of a vector with p variables. For a simple model of factor analysis type, it is proved that…
Detecting associations between microbial compositions and sample characteristics is one of the most important tasks in microbiome studies. Most of the existing methods apply univariate models to single microbial species separately, with…
Principal Component Analysis (PCA) is a very successful dimensionality reduction technique, widely used in predictive modeling. A key factor in its widespread use in this domain is the fact that the projection of a dataset onto its first…
This paper introduces a Projected Principal Component Analysis (Projected-PCA), which employs principal component analysis to the projected (smoothed) data matrix onto a given linear space spanned by covariates. When it applies to…
The advances of next-generation sequencing technology have accelerated study of the microbiome and stimulated the high throughput profiling of metagenomes. The large volume of sequenced data has encouraged the rise of various studies for…
Principal component analysis (PCA) is one of the most widely used dimension reduction and multivariate statistical techniques. From a probabilistic perspective, PCA seeks a low-dimensional representation of data in the presence of…
This paper considers the estimation and inference of the low-rank components in high-dimensional matrix-variate factor models, where each dimension of the matrix-variates ($p \times q$) is comparable to or greater than the number of…
Principal Component Analysis (PCA) is a dimension reduction technique. It produces inconsistent estimators when the dimensionality is moderate to high, which is often the problem in modern large-scale applications where algorithm…