Related papers: Structured variable selection and estimation

Automated Analysis of Experiments using Hierarchical Garrote

In this work, we propose an automatic method for the analysis of experiments that incorporates hierarchical relationships between the experimental variables. We use a modified version of nonnegative garrote method for variable selection…

Methodology · Statistics 2024-11-05 Wei-Yang Yu , V. Roshan Joseph

Hierarchical selection of variables in sparse high-dimensional regression

We study a regression model with a huge number of interacting variables. We consider a specific approximation of the regression function under two assumptions: (i) there exists a sparse representation of the regression function in a…

Statistics Theory · Mathematics 2009-09-29 Peter J. Bickel , Ya'acov Ritov , Alexander B. Tsybakov

Group Regularized Estimation under Structural Hierarchy

Variable selection for models including interactions between explanatory variables often needs to obey certain hierarchical constraints. The weak or strong structural hierarchy requires that the existence of an interaction term implies at…

Statistics Theory · Mathematics 2016-11-10 Yiyuan She , Zhifeng Wang , He Jiang

Structured nonlinear variable selection

We investigate structured sparsity methods for variable selection in regression problems where the target depends nonlinearly on the inputs. We focus on general nonlinear functions not limiting a priori the function space to additive…

Machine Learning · Statistics 2018-05-17 Magda Gregorová , Alexandros Kalousis , Stéphane Marchand-Maillet

Variable selection in multiple regression with random design

We propose a method for variable selection in multiple regression with random predictors. This method is based on a criterion that permits to reduce the variable selection problem to a problem of estimating suitable permutation and…

Statistics Theory · Mathematics 2015-06-29 Alban Mbina Mbina , Guy Martial Nkiet , Assi Nguessan

An Easy-to-Implement Hierarchical Standardization for Variable Selection Under Strong Heredity Constraint

For many practical problems, the regression models follow the strong heredity property (also known as the marginality), which means they include parent main effects when a second-order effect is present. Existing methods rely mostly on…

Methodology · Statistics 2020-07-28 Kedong Chen , William Li , Sijian Wang

Flexible Variable Selection for Recovering Sparsity in Nonadditive Nonparametric Models

Variable selection for recovering sparsity in nonadditive nonparametric models has been challenging. This problem becomes even more difficult due to complications in modeling unknown interaction terms among high dimensional variables. There…

Methodology · Statistics 2012-06-14 Zaili Fang , Inyoung Kim , Patrick Schaumont

Tree-Structured Modelling of Categorical Predictors in Regression

Generalized linear and additive models are very efficient regression tools but the selection of relevant terms becomes difficult if higher order interactions are needed. In contrast, tree-based methods also known as recursive partitioning…

Methodology · Statistics 2015-04-21 Gerhard Tutz , Moritz Berger

Manifold Structured Prediction

Structured prediction provides a general framework to deal with supervised problems where the outputs have semantically rich structure. While classical approaches consider finite, albeit potentially huge, output spaces, in this paper we…

Machine Learning · Statistics 2018-06-27 Alessandro Rudi , Carlo Ciliberto , Gian Maria Marconi , Lorenzo Rosasco

Analysing Large Scale Structure: I. Weighted Scaling Indices and Constrained Randomisation

The method of constrained randomisation is applied to three-dimensional simulated galaxy distributions. With this technique we generate for a given data set surrogate data sets which have the same linear properties as the original data…

Astrophysics · Physics 2009-11-07 C. Raeth , W. Bunk , M. Huber , G. Morfill , J. Retzlaff , P. Schuecker

RafterNet: Probabilistic predictions in multi-response regression

A fully nonparametric approach for making probabilistic predictions in multi-response regression problems is introduced. Random forests are used as marginal models for each response variable and, as novel contribution of the present work,…

Machine Learning · Computer Science 2022-10-12 Marius Hofert , Avinash Prasad , Mu Zhu

Group COMBSS: Group Selection via Continuous Optimization

We present a new optimization method for the group selection problem in linear regression. In this problem, predictors are assumed to have a natural group structure and the goal is to select a small set of groups that best fits the…

Methodology · Statistics 2024-04-23 Anant Mathur , Sarat Moka , Benoit Liquet , Zdravko Botev

A Random-effects Approach to Regression Involving Many Categorical Predictors and Their Interactions

Linear model prediction with a large number of potential predictors is both statistically and computationally challenging. The traditional approaches are largely based on shrinkage selection/estimation methods, which are applicable even…

Methodology · Statistics 2024-09-17 Hanmei Sun , Jiangshan Zhang , Jiming Jiang

Hierarchical variable clustering based on the predictive strength between random vectors

A rank-invariant clustering of variables is introduced that is based on the predictive strength between groups of variables, i.e., two groups are assigned a high similarity if the variables in the first group contain high predictive…

Methodology · Statistics 2023-12-29 Sebastian Fuchs , Yuping Wang

Regression modeling on stratified data with the lasso

We consider the estimation of regression models on strata defined using a categorical covariate, in order to identify interactions between this categorical covariate and the other predictors. A basic approach requires the choice of a…

Statistics Theory · Mathematics 2016-11-09 Edouard Ollier , Vivian Viallon

Distributed variable screening for generalized linear models

In this article, we develop a distributed variable screening method for generalized linear models. This method is designed to handle situations where both the sample size and the number of covariates are large. Specifically, the proposed…

Methodology · Statistics 2024-05-09 Tianbo Diao , Lianqiang Qu , Bo Li , Liuquan Sun

Implicit Generative Copulas

Copulas are a powerful tool for modeling multivariate distributions as they allow to separately estimate the univariate marginal distributions and the joint dependency structure. However, known parametric copulas offer limited flexibility…

Machine Learning · Statistics 2021-11-11 Tim Janke , Mohamed Ghanmi , Florian Steinke

Exploiting Categorical Structure Using Tree-Based Methods

Standard methods of using categorical variables as predictors either endow them with an ordinal structure or assume they have no structure at all. However, categorical variables often possess structure that is more complicated than a linear…

Machine Learning · Statistics 2020-04-17 Brian Lucena

Imputing missing values with unsupervised random trees

This work proposes a non-iterative strategy for missing value imputations which is guided by similarity between observations, but instead of explicitly determining distances or nearest neighbors, it assigns observations to overlapping…

Machine Learning · Statistics 2019-11-25 David Cortes

Wasserstein Generative Regression

In this paper, we propose a new and unified approach for nonparametric regression and conditional distribution learning. Our approach simultaneously estimates a regression function and a conditional generator using a generative learning…

Machine Learning · Statistics 2023-06-28 Shanshan Song , Tong Wang , Guohao Shen , Yuanyuan Lin , Jian Huang