Related papers: Modelling Interactions in High-dimensional Data wi…
Finite mixture regression models are useful for modeling the relationship between response and predictors, arising from different subpopulations. In this article, we study high-dimensional predic- tors and high-dimensional response, and…
Beta regression is commonly employed when the outcome variable is a proportion. Since its conception, the approach has been widely used in applications spanning various scientific fields. A series of extensions have been proposed over time,…
In recent years, there has been considerable theoretical development regarding variable selection consistency of penalized regression techniques, such as the lasso. However, there has been relatively little work on quantifying the…
Inference for high-dimensional logistic regression models using penalized methods has been a challenging research problem. As an illustration, a major difficulty is the significant bias of the Lasso estimator, which limits its direct…
We consider regression problems where the number of predictors greatly exceeds the number of observations. We propose a method for variable selection that first estimates the regression function, yielding a "pre-conditioned" response…
This thesis responds to the challenges of using a large number, such as thousands, of features in regression and classification problems. There are two situations where such high dimensional features arise. One is when high dimensional…
Lasso-type estimators are routinely used to estimate high-dimensional time series models. The theoretical guarantees established for these estimators typically require the penalty level to be chosen in a suitable fashion often depending on…
We add a set of convex constraints to the lasso to produce sparse interaction models that honor the hierarchy restriction that an interaction only be included in a model if one or both variables are marginally important. We give a precise…
Among the most popular variable selection procedures in high-dimensional regression, Lasso provides a solution path to rank the variables and determines a cut-off position on the path to select variables and estimate coefficients. In this…
One popular approach for nonstructural economic and financial forecasting is to include a large number of economic and financial variables, which has been shown to lead to significant improvements for forecasting, for example, by the…
We present a novel approach to address the multi-agent sparse contextual linear bandit problem, in which the feature vectors have a high dimension $d$ whereas the reward function depends on only a limited set of features - precisely $s_0…
Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well-developed, the relative efficacy of different approaches in finite-sample…
Backdoor adjustment is a technique in causal inference for estimating interventional quantities from purely observational data. For example, in medical settings, backdoor adjustment can be used to control for confounding and estimate the…
In statistical learning framework with regressions, interactions are the contributions to the response variable from the products of the explanatory variables. In high-dimensional problems, detecting interactions is challenging due to…
The paper considers variable selection in linear regression models where the number of covariates is possibly much larger than the number of observations. High dimensionality of the data brings in many complications, such as (possibly…
Including pairwise interactions between the predictors of a regression model can produce better predicting models. However, to fit such interaction models on typical data sets in biology and other fields can often require solving enormous…
Feature interactions can contribute to a large proportion of variation in many prediction models. In the era of big data, the coexistence of high dimensionality in both responses and covariates poses unprecedented challenges in identifying…
High-dimensional time series datasets are becoming increasingly common in many areas of biological and social sciences. Some important applications include gene regulatory network reconstruction using time course gene expression data, brain…
The lasso is a popular method to induce shrinkage and sparsity in the solution vector (coefficients) of regression problems, particularly when there are many predictors relative to the number of observations. Solving the lasso in this…
In many environments only a tiny subset of all states yield high reward. In these cases, few of the interactions with the environment provide a relevant learning signal. Hence, we may want to preferentially train on those high-reward states…