Related papers: A High-dimensional M-estimator Framework for Bi-le…
In this paper, we consider a Bayesian bi-level variable selection problem in high-dimensional regressions. In many practical situations, it is natural to assign group membership to each predictor. Examples include that genetic variants can…
When fitting statistical models, some predictors are often found to be correlated with each other, and functioning together. Many group variable selection methods are developed to select the groups of predictors that are closely related to…
In genetical genomics studies, it is important to jointly analyze gene expression data and genetic variants in exploring their associations with complex traits, where the dimensionality of gene expressions and genetic variants can both be…
It becomes an interesting problem to identify subgroup structures in data analysis as populations are probably heterogeneous in practice. In this paper, we consider M-estimators together with both concave and pairwise fusion penalties,…
We propose a Machine Learning approach for optimal macroeconomic density forecasting in a high-dimensional setting where the underlying model exhibits a known group structure. Our approach is general enough to encompass specific forecasting…
Numerous variable selection methods rely on a two-stage procedure, where a sparsity-inducing penalty is used in the first stage to predict the support, which is then conveyed to the second stage for estimation or inference purposes. In this…
Multi-stage (designed) procedures, obtained by splitting the sampling budget suitably across stages, and designing the sampling at a particular stage based on information about the parameter obtained from previous stages, are often…
Sparsity-inducing penalties are useful tools for variable selection and they are also effective for regression settings where the data are functions. We consider the problem of selecting not only variables but also decision boundaries in…
Datasets containing both categorical and continuous variables are frequently encountered in many areas, and with the rapid development of modern measurement technologies, the dimensions of these variables can be very high. Despite the…
Sparse modelling or model selection with categorical data is challenging even for a moderate number of variables, because one parameter is roughly needed to encode one category or level. The Group Lasso is a well known efficient algorithm…
We consider nonlinear mixed effects models including high-dimensional covariates to model individual parameters variability. The objective is to identify relevant covariates among a large set under sparsity assumption and to estimate model…
In this paper, we address the problem of conducting statistical inference in settings involving large-scale data that may be high-dimensional and contaminated by outliers. The high volume and dimensionality of the data require distributed…
For factor model, the involved covariance matrix often has no row sparse structure because the common factors may lead some variables to strongly associate with many others. Under the ultra-high dimensional paradigm, this feature causes…
In high-dimensions, many variable selection methods, such as the lasso, are often limited by excessive variability and rank deficiency of the sample covariance matrix. Covariance sparsity is a natural phenomenon in high-dimensional…
Classification and probability estimation are fundamental tasks with broad applications across modern machine learning and data science, spanning fields such as biology, medicine, engineering, and computer science. Recent development of…
We mainly study the M-estimation method for the high-dimensional linear regression model, and discuss the properties of M-estimator when the penalty term is the local linear approximation. In fact, M-estimation method is a framework, which…
In recent years, bilevel approaches have become very popular to efficiently estimate high-dimensional hyperparameters of machine learning models. However, to date, binary parameters are handled by continuous relaxation and rounding…
We consider the high-dimensional discriminant analysis problem. For this problem, different methods have been proposed and justified by establishing exact convergence rates for the classification risk, as well as the l2 convergence results…
Consider the normal linear regression setup when the number of covariates p is much larger than the sample size n, and the covariates form correlated groups. The response variable y is not related to an entire group of covariates in all or…
The use of M-estimators in generalized linear regression models in high dimensional settings requires risk minimization with hard $L_0$ constraints. Of the known methods, the class of projected gradient descent (also known as iterative hard…