Related papers: Adaptive Randomized Dimension Reduction on Massive…
Scalability of statistical estimators is of increasing importance in modern applications and dimension reduction is often used to extract relevant information from data. A variety of popular dimension reduction approaches can be framed as…
This paper presents a randomized algorithm for computing the near-optimal low-rank dynamic mode decomposition (DMD). Randomized algorithms are emerging techniques to compute low-rank matrix approximations at a fraction of the cost of…
Large models and enormous data are essential driving forces of the unprecedented successes achieved by modern algorithms, especially in scientific computing and machine learning. Nevertheless, the growing dimensionality and model…
Dimension reduction is often the first step in statistical modeling or prediction of multivariate spatial data. However, most existing dimension reduction techniques do not account for the spatial correlation between observations and do not…
We study adaptive data-dependent dimensionality reduction in the context of supervised learning in general metric spaces. Our main statistical contribution is a generalization bound for Lipschitz functions in metric spaces that are…
We consider multi-class classification problems for high dimensional data. Following the idea of reduced-rank linear discriminant analysis (LDA), we introduce a new dimension reduction tool with a flavor of supervised principal component…
Dimensionality reduction techniques play important roles in the analysis of big data. Traditional dimensionality reduction approaches, such as principal component analysis (PCA) and linear discriminant analysis (LDA), have been studied…
This work explores a novel approach for adaptive, differentiable parametrization of large-scale non-stationary random fields. Coupled with any gradient-based algorithm, the method can be applied to variety of optimization problems,…
We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces. We show that an adaptive sampling strategy for the random subspace significantly…
High-dimensional classification has become an increasingly important problem. In this paper we propose a "Multivariate Adaptive Stochastic Search" (MASS) approach which first reduces the dimension of the data space and then applies a…
This paper is concerned with the problem of low rank plus sparse matrix decomposition for big data. Conventional algorithms for matrix decomposition use the entire data to extract the low-rank and sparse components, and are based on…
We propose dimension reduction methods for sparse, high-dimensional multivariate response regression models. Both the number of responses and that of the predictors may exceed the sample size. Sometimes viewed as complementary, predictor…
These notes are an overview of some classical linear methods in Multivariate Data Analysis. This is a good old domain, well established since the 60's, and refreshed timely as a key step in statistical learning. It can be presented as part…
Random projection is widely used as a method of dimension reduction. In recent years, its combination with standard techniques of regression and classification has been explored. Here we examine its use with principal component analysis…
Dimension reduction is often an important step in the analysis of high-dimensional data. PCA is a popular technique to find the best low-dimensional approximation of high-dimensional data. However, classical PCA is very sensitive to…
Latent variable models represent a useful tool for the analysis of complex data when the constructs of interest are not observable. A problem related to these models is that the integrals involved in the likelihood function cannot be solved…
In recent years, large language models (LLMs) have driven advances in natural language processing. Still, their growing scale has increased the computational burden, necessitating a balance between efficiency and performance. Low-rank…
In genetic studies, not only can the number of predictors obtained from microarray measurements be extremely large, there can also be multiple response variables. Motivated by such a situation, we consider semiparametric dimension reduction…
For very large datasets, random projections (RP) have become the tool of choice for dimensionality reduction. This is due to the computational complexity of principal component analysis. However, the recent development of randomized…
This paper, broadly speaking, covers the use of randomness in two main areas: low-rank approximation and kernel methods. Low-rank approximation is very important in numerical linear algebra. Many applications depend on matrix decomposition…