Related papers: A constrained minimum criterion for model selectio…
We consider Bayesian model selection in generalized linear models that are high-dimensional, with the number of covariates p being large relative to the sample size n, but sparse in that the number of active covariates is small compared to…
For regression model selection via maximum likelihood estimation, we adopt a vector representation of candidate models and study the likelihood ratio confidence region for the regression parameter vector of a full model. We show that when…
A standard goal of model evaluation and selection is to find a model that approximates the truth well while at the same time is as parsimonious as possible. In this paper we emphasize the point of view that the models under consideration…
If the assumed model does not accurately capture the underlying structure of the data, a statistical method is likely to yield sub-optimal results, and so model selection is crucial in order to conduct any statistical analysis. However, in…
Model selection consistency in the high-dimensional regression setting can be achieved only if strong assumptions are fulfilled. We therefore suggest to pursue a different goal, which we call a minimal class of models. The minimal class of…
Lasso and other regularization procedures are attractive methods for variable selection, subject to a proper choice of shrinkage parameter. Given a set of potential subsets produced by a regularization algorithm, a consistent model…
Composite likelihood has shown promise in settings where the number of parameters $p$ is large due to its ability to break down complex models into simpler components, thus enabling inference even when the full likelihood is not tractable.…
We consider the least-square linear regression problem with regularization by the $\ell^1$-norm, a problem usually referred to as the Lasso. In this paper, we first present a detailed asymptotic analysis of model consistency of the Lasso in…
We consider a new criterion-based approach to model selection in linear regression. Properties of selection criteria based on p-values of a likelihood ratio statistic are studied for families of linear regression models. We prove that such…
The model selection procedure is usually a single-criterion decision making in which we select the model that maximizes a specific metric in a specific set, such as the Validation set performance. We claim this is very naive and can perform…
The issue of model selection in applied research is of vital importance. Since the true model in such research is not known, which model should be used from among various potential ones is an empirical question. There might exist several…
Choice models, which capture popular preferences over objects of interest, play a key role in making decisions whose eventual outcome is impacted by human choice behavior. In most scenarios, the choice model, which can effectively be viewed…
Often the goal of model selection is to choose a model for future prediction, and it is natural to measure the accuracy of a future prediction by squared error loss. Under the Bayesian approach, it is commonly perceived that the optimal…
Model selection criteria are one of the most important tools in statistics. Proofs showing a model selection criterion is asymptotically optimal are tailored to the type of model (linear regression, quantile regression, penalized…
One of the core applications of machine learning to knowledge discovery consists on building a function (a hypothesis) from a given amount of data (for instance a decision tree or a neural network) such that we can use it afterwards to…
Sparse feature selection is necessary when we fit statistical models, we have access to a large group of features, don't know which are relevant, but assume that most are not. Alternatively, when the number of features is larger than the…
In this paper, we review state-of-the-art methods for feature selection in statistics with an application-oriented eye. Indeed, sparsity is a valuable property and the profusion of research on the topic might have provided little guidance…
Fitting models to data is an important part of the practice of science. Advances in machine learning have made it possible to fit more -- and more complex -- models, but have also exacerbated a problem: when multiple models fit the data…
We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the…
This paper considers the sample-efficiency of preference learning, which models and predicts human choices based on comparative judgments. The minimax optimal estimation error rate $\Theta(d/n)$ in classical estimation theory requires that…