Related papers: Random Subset Averaging
Representational similarity analysis (RSA) is a multivariate technique to investigate cortical representations of objects or constructs. While avoiding ill-posed matrix inversions that plague multivariate approaches in the presence of many…
We propose a novel conditional quantile prediction method based on complete subset averaging (CSA) for quantile regressions. All models under consideration are potentially misspecified and the dimension of regressors goes to infinity as the…
Machine learning techniques always aim to reduce the generalized prediction error. In order to reduce it, ensemble methods present a good approach combining several models that results in a greater forecasting capacity. The Random Machines…
Representational Similarity Analysis (RSA) is a popular method for analyzing neuroimaging and behavioral data. Here we evaluate the accuracy and reliability of RSA in the context of model selection, and compare it to that of regression.…
We study efficiency improvements in randomized experiments for estimating a vector of potential outcome means using regression adjustment (RA) when there are more than two treatment levels. We show that linear RA which estimates separate…
Variable screening methods have been shown to be effective in dimension reduction under the ultra-high dimensional setting. Most existing screening methods are designed to rank the predictors according to their individual contributions to…
We propose a novel sparse sliced inverse regression method based on random projections in a large $p$ small $n$ setting. Embedded in a generalized eigenvalue framework, the proposed approach finally reduces to parallel execution of…
Meta-learning involves training models on a variety of training tasks in a way that enables them to generalize well on new, unseen test tasks. In this work, we consider meta-learning within the framework of high-dimensional multivariate…
We present a novel adaptive random subspace learning algorithm (RSSL) for prediction purpose. This new framework is flexible where it can be adapted with any learning technique. In this paper, we tested the algorithm for regression and…
The challenge of Out-of-Distribution (OOD) generalization poses a foundational concern for the application of machine learning algorithms to risk-sensitive areas. Inspired by traditional importance weighting and propensity weighting…
The random forest (RF) algorithm has become a very popular prediction method for its great flexibility and promising accuracy. In RF, it is conventional to put equal weights on all the base learners (trees) to aggregate their predictions.…
We introduce a very general method for sparse and large-scale variable selection. The large-scale regression settings is such that both the number of parameters and the number of samples are extremely large. The proposed method is based on…
Sparse reduced rank regression is an essential statistical learning method. In the contemporary literature, estimation is typically formulated as a nonconvex optimization that often yields to a local optimum in numerical computation. Yet,…
There has been a great deal of recent interest in the development of spatial prediction algorithms for very large datasets and/or prediction domains. These methods have primarily been developed in the spatial statistics community, but there…
The motivation of this work is to improve the performance of standard stacking approaches or ensembles, which are composed of simple, heterogeneous base models, through the integration of the generation and selection stages for regression…
We propose a flexible ensemble classification framework, Random Subspace Ensemble (RaSE), for sparse classification. In the RaSE algorithm, we aggregate many weak learners, where each weak learner is a base classifier trained in a subspace…
Soft random sampling (SRS) is a simple yet effective approach for efficient training of large-scale deep neural networks when dealing with massive data. SRS selects a subset uniformly at random with replacement from the full data set in…
Automatic data augmentation (AutoDA) plays an important role in enhancing the generalization of neural networks. However, mainstream AutoDA methods often encounter two challenges: either the search process is excessively time-consuming,…
We present an autodifferentiable rejection sampling algorithm termed Rejection Sampling with Autodifferentiation (RSA). In conjunction with reweighting, we show that RSA can be used for efficient parameter estimation and model exploration.…
Ensemble methods combine the predictions of multiple models to improve performance, but they require significantly higher computation costs at inference time. To avoid these costs, multiple neural networks can be combined into one by…