Related papers: Variable selection in high-dimensional linear mode…
Standard penalized methods of variable selection and parameter estimation rely on the magnitude of coefficient estimates to decide which variables to include in the final model. However, coefficient estimates are unreliable when the design…
This paper focuses on variable selection for a partially linear single-index varying-coefficient model. A regularized variable selection procedure by combining basis function approximations with SCAD penalty is proposed. It can…
In this paper, we propose a novel variable selection approach in the framework of multivariate linear models taking into account the dependence that may exist between the responses. It consists in estimating beforehand the covariance matrix…
We consider a flexible semiparametric quantile regression model for analyzing high dimensional heterogeneous data. This model has several appealing features: (1) By considering different conditional quantiles, we may obtain a more complete…
In high-dimensional regression modelling, the number of candidate covariates to be included in the predictor is quite large, and variable selection is crucial. In this work, we propose a new penalty able to guarantee both sparse variable…
Variable selection plays a fundamental role in high-dimensional data analysis. Various methods have been developed for variable selection in recent years. Well-known examples are forward stepwise regression (FSR) and least angle regression…
A new empirical Bayes approach to variable selection in the context of generalized linear models is developed. The proposed algorithm scales to situations in which the number of putative explanatory variables is very large, possibly much…
Variable selection in cluster analysis is important yet challenging. It can be achieved by regularization methods, which realize a trade-off between the clustering accuracy and the number of selected variables by using a lasso-type penalty.…
We propose generalized additive partial linear models for complex data which allow one to capture nonlinear patterns of some covariates, in the presence of linear components. The proposed method improves estimation efficiency and increases…
We consider the task of estimating a high-dimensional directed acyclic graph, given observations from a linear structural equation model with arbitrary noise distribution. By exploiting properties of common random graphs, we develop a new…
In multivariate regression, when covariates are numerous, it is often reasonable to assume that only a small number of them has predictive information. In some medical applications for instance, it is believed that only a few genes out of…
The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…
The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the…
In big data analysis, a simple task such as linear regression can become very challenging as the variable dimension $p$ grows. As a result, variable screening is inevitable in many scientific studies. In recent years, randomized algorithms…
We consider the problem of simultaneous variable selection and constant coefficient identification in high-dimensional varying coefficient models based on B-spline basis expansion. Both objectives can be considered as some type of model…
Partition-wise models offer a flexible approach for modeling complex and multidimensional data that are capable of producing interpretable results. They are based on partitioning the observed data into regions, each of which is modeled with…
In variable selection, most existing screening methods focus on marginal effects and ignore dependence between covariates. To improve the performance of selection, we incorporate pairwise effects in covariates for screening and…
This paper proposes probabilistic conformal prediction (PCP), a predictive inference algorithm that estimates a target variable by a discontinuous predictive set. Given inputs, PCP construct the predictive set based on random samples from…
In various real-world problems, we are presented with classification problems with positive and unlabeled data, referred to as presence-only responses. In this paper, we study variable selection in the context of presence only responses…
We study the problem of variable selection in convex nonparametric regression. Under the assumption that the true regression function is convex and sparse, we develop a screening procedure to select a subset of variables that contains the…