Related papers: High dimensional Gaussian classification
This paper gives a theoretical analysis of high dimensional linear discrimination of Gaussian data. We study the excess risk of linear discriminant rules. We emphasize the poor performance of standard procedures in the case when…
This thesis responds to the challenges of using a large number, such as thousands, of features in regression and classification problems. There are two situations where such high dimensional features arise. One is when high dimensional…
For high dimensional data, some of the standard statistical techniques do not work well, so modification or further development of statistical methods is necessary. In this paper, we explore these modifications. We start with the important…
Graphical models are commonly used to represent conditional dependence relationships between variables. There are multiple methods available for exploring them from high-dimensional data, but almost all of them rely on the assumption that…
We consider the classification problem of a high-dimensional mixture of two Gaussians with general covariance matrices. Using the replica method from statistical physics, we investigate the asymptotic behavior of a general class of…
Gaussian processes are a widely embraced technique for regression and classification due to their good prediction accuracy, analytical tractability and built-in capabilities for uncertainty quantification. However, they suffer from the…
The advent of modern technology, permitting the measurement of thousands of characteristics simultaneously, has given rise to floods of data characterized by many large or even huge datasets. This new paradigm presents extraordinary…
We propose an extensive simulation study to compare some variable selection procedures in a high-dimensional framework. Assuming that the relationship between the active variables and the response variable is linear, the high-dimensional…
Density estimation is one of the central areas of statistics whose purpose is to estimate the probability density function underlying the observed data. It serves as a building block for many tasks in statistical inference, visualization,…
In this paper, we develop a systematic theory for high dimensional analysis of variance in multivariate linear regression, where the dimension and the number of coefficients can both grow with the sample size. We propose a new U type…
Clustering in high-dimensional spaces is a difficult problem which is recurrent in many domains, for example in image analysis. The difficulty is due to the fact that high-dimensional data usually live in different low-dimensional subspaces…
This article carries out a large dimensional analysis of standard regularized discriminant analysis classifiers designed on the assumption that data arise from a Gaussian mixture model with different means and covariances. The analysis…
This paper deals with Gibbs samplers that include high dimensional conditional Gaussian distributions. It proposes an efficient algorithm that avoids the high dimensional Gaussian sampling and relies on a random excursion along a small set…
A ubiquitous feature of data of our era is their extra-large sizes and dimensions. Analyzing such high-dimensional data poses significant challenges, since the feature dimension is often much larger than the sample size. This thesis…
It is now practically the norm for data to be very high dimensional in areas such as genetics, machine vision, image analysis and many others. When analyzing such data, parametric models are often too inflexible while nonparametric…
Gaussian process regression is widely used because of its ability to provide well-calibrated uncertainty estimates and handle small or sparse datasets. However, it struggles with high-dimensional data. One possible way to scale this…
This tutorial provides an exposition of a flexible geometric framework for high dimensional estimation problems with constraints. The tutorial develops geometric intuition about high dimensional sets, justifies it with some results of…
High dimension, low sample size (HDLSS) data are prevalent in many fields of study. There has been an increased focus recently on using machine learning and statistical methods to mine valuable information out of…
Gaussian processes (GPs) are widely used in nonparametric regression, classification and spatio-temporal modeling, motivated in part by a rich literature on theoretical properties. However, a well known drawback of GPs that limits their use…
We consider the problem of constructing nonparametric undirected graphical models for high-dimensional functional data. Most existing statistical methods in this context assume either a Gaussian distribution on the vertices or linear…