Related papers: Exact Dimensionality Selection for Bayesian PCA
Estimating intrinsic dimensionality of data is a classic problem in pattern recognition and statistics. Principal Component Analysis (PCA) is a powerful tool in discovering dimensionality of data sets with a linear structure; it, however,…
We discuss the problem of estimating the number of principal components in Principal Com- ponents Analysis (PCA). Despite of the importance of the problem and the multitude of solutions proposed in the literature, it comes as a surprise…
Principal component analysis (PCA) is very popular to perform dimension reduction. The selection of the number of significant components is essential but often based on some practical heuristics depending on the application. Only few works…
Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in…
Sparse versions of principal component analysis (PCA) have imposed themselves as simple, yet powerful ways of selecting relevant features of high-dimensional data in an unsupervised manner. However, when several sparse principal components…
Principal component analysis (PCA) is perhaps the most widely used method for data dimensionality reduction. A key question in PCA is deciding how many factors to retain. This manuscript describes a new approach to automatically selecting…
A fully Bayesian approach is proposed for ultrahigh-dimensional nonparametric additive models in which the number of additive components may be larger than the sample size, though ideally the true model is believed to include only a small…
Dimension reduction techniques are among the most essential analytical tools in the analysis of high-dimensional data. Generalized principal component analysis (PCA) is an extension to standard PCA that has been widely used to identify…
High dimensional data has introduced challenges that are difficult to address when attempting to implement classical approaches of statistical process control. This has made it a topic of interest for research due in recent years. However,…
In this paper, we develop new statistical theory for probabilistic principal component analysis models in high dimensions. The focus is the estimation of the noise variance, which is an important and unresolved issue when the number of…
Principal component analysis (PCA) is one of the most widely used dimension reduction and multivariate statistical techniques. From a probabilistic perspective, PCA seeks a low-dimensional representation of data in the presence of…
The real-life data have a complex and non-linear structure due to their nature. These non-linearities and the large number of features can usually cause problems such as the empty-space phenomenon and the well-known curse of dimensionality.…
When the dimension of data is comparable to or larger than the number of data samples, Principal Components Analysis (PCA) may exhibit problematic high-dimensional noise. In this work, we propose an Empirical Bayes PCA method that reduces…
We provide a probabilistic and infinitesimal view of how the principal component analysis procedure (PCA) can be generalized to analysis of nonlinear manifold valued data. Starting with the probabilistic PCA interpretation of the Euclidean…
We consider principal component analysis for contaminated data-set in the high dimensional regime, where the dimensionality of each observation is comparable or even more than the number of observations. We propose a deterministic…
A widely used method to create a continuous representation of a discrete data-set is regression analysis. When the regression model is not based on a mathematical description of the physics underlying the data, heuristic techniques play a…
We explore the theoretical and numerical property of a fully Bayesian model selection method in sparse ultrahigh-dimensional settings, i.e., $p\gg n$, where $p$ is the number of covariates and $n$ is the sample size. Our method consists of…
High dimensional data and systems with many degrees of freedom are often characterized by covariance matrices. In this paper, we consider the problem of simultaneously estimating the dimension of the principal (dominant) subspace of these…
We present a technique to perform dimensionality reduction on data that is subject to uncertainty. Our method is a generalization of traditional principal component analysis (PCA) to multivariate probability distributions. In comparison to…
Bayesian model selection provides a powerful framework for objectively comparing models directly from observed data, without reference to ground truth data. However, Bayesian model selection requires the computation of the marginal…