Related papers: Bootstrap tuning in ordered model selection
We consider penalized extremum estimation of a high-dimensional, possibly nonlinear model that is sparse in the sense that most of its parameters are zero but some are not. We use the SCAD penalty function, which provides model selection…
The bootstrap is a widely used procedure for statistical inference because of its simplicity and attractive statistical properties. However, the vanilla version of bootstrap is no longer feasible computationally for many modern massive…
We consider the problem of estimating how well a model class is capable of fitting a distribution of labeled data. We show that it is often possible to accurately estimate this "learnability" even when given an amount of data that is too…
A multiplier bootstrap procedure for construction of likelihood-based confidence sets is considered for finite samples and a possible model misspecification. Theoretical results justify the bootstrap validity for a small or moderate sample…
Model averaging has gained significant attention in recent years due to its ability of fusing information from different models. The critical challenge in frequentist model averaging is the choice of weight vector. The bootstrap method,…
Multiple systems estimation using a Poisson loglinear model is a standard approach to quantifying hidden populations where data sources are based on lists of known cases. Information criteria are often used for selecting between the large…
In a standard classification framework a set of trustworthy learning data are employed to build a decision rule, with the final aim of classifying unlabelled units belonging to the test set. Therefore, unreliable labelled observations,…
The Bootstrap method application in simulation supposes that value of random variables are not generated during the simulation process but extracted from available sample populations. In the case of Hierarchical Bootstrap the function of…
In distributed, or privacy-preserving learning, we are often given a set of probabilistic models estimated from different local repositories, and asked to combine them into a single model that gives efficient statistical estimation. A…
Although the methods of bagging and random forests are some of the most widely used prediction methods, relatively little is known about their algorithmic convergence. In particular, there are not many theoretical guarantees for deciding…
Regularized regression approaches such as the Lasso have been widely adopted for constructing sparse linear models in high-dimensional datasets. A complexity in fitting these models is the tuning of the parameters which control the level of…
We study the out-of-sample properties of robust empirical optimization problems with smooth $\phi$-divergence penalties and smooth concave objective functions, and develop a theory for data-driven calibration of the non-negative "robustness…
When randomized ensemble methods such as bagging and random forests are implemented, a basic question arises: Is the ensemble large enough? In particular, the practitioner desires a rigorous guarantee that a given ensemble will perform…
The statistical regression technique is an extraordinarily essential data fitting tool to explore the potential possible generation mechanism of the random phenomenon. Therefore, the model selection or the variable selection is becoming…
In this paper, a new family of resampling-based penalization procedures for model selection is defined in a general framework. It generalizes several methods, including Efron's bootstrap penalization and the leave-one-out penalization…
The paper deals with the non-parametric estimation in the regression with the multiplicative noise. Using the local polynomial fitting and the bayesian approach, we construct the minimax on isotropic H\"older class estimator. Next applying…
The partially linear binary choice model can be used for estimating structural equations where nonlinearity may appear due to diminishing marginal returns, different life cycle regimes, or hectic physical phenomena. The inference procedure…
Reliable forward uncertainty quantification in engineering requires methods that account for aleatory and epistemic uncertainties. In many applications, epistemic effects arising from uncertain parameters and model form dominate prediction…
We present a new family of model selection algorithms based on the resampling heuristics. It can be used in several frameworks, do not require any knowledge about the unknown law of the data, and may be seen as a generalization of local…
We consider the least-square linear regression problem with regularization by the $\ell^1$-norm, a problem usually referred to as the Lasso. In this paper, we first present a detailed asymptotic analysis of model consistency of the Lasso in…