Related papers: Model Selection in High-Dimensional Linear Regress…

Variable Selection in High Dimensional Linear Regressions with Parameter Instability

This paper considers the problem of variable selection allowing for parameter instability. It distinguishes between signal and pseudo-signal variables that are correlated with the target variable, and noise variables that are not, and…

Econometrics · Economics 2024-07-17 Alexander Chudik, M. Hashem Pesaran, Mahrad Sharifvaghefi

Enhanced variable selection for boosting sparser and less complex models in distributional copula regression

Structured additive distributional copula regression allows to model the joint distribution of multivariate outcomes by relating all distribution parameters to covariates. Estimation via statistical boosting enables accounting for…

Methodology · Statistics 2024-06-07 Annika Strömer, Nadja Klein, Christian Staerk, Florian Faschingbauer, Hannah Klinkhammer, Andreas Mayr

Scalable Feature Selection for (Multitask) Gradient Boosted Trees

Gradient Boosted Decision Trees (GBDTs) are widely used for building ranking and relevance models in search and recommendation. Considerations such as latency and interpretability dictate the use of as few features as possible to train…

Machine Learning · Statistics 2021-09-07 Cuize Han, Nikhil Rao, Daria Sorokina, Karthik Subbian

Bayesian Boosting for Linear Mixed Models

Boosting methods are widely used in statistical learning to deal with high-dimensional data due to their variable selection feature. However, those methods lack straightforward ways to construct estimators for the precision of the…

Methodology · Statistics 2021-06-10 Boyao Zhang, Colin Griesbach, Cora Kim, Nadia Müller-Voggel, Elisabeth Bergherr

Probing for sparse and fast variable selection with model-based boosting

We present a new variable selection method based on model-based gradient boosting and randomly permuted variables. Model-based boosting is a tool to fit a statistical model while performing variable selection at the same time. A drawback of…

Machine Learning · Statistics 2017-02-16 Janek Thomas, Tobias Hepp, Andreas Mayr, Bernd Bischl

BoostLLM: Boosting-inspired LLM Fine-tuning for Few-shot Tabular Classification

Large language models (LLMs) have recently been adapted to tabular prediction by serializing structured features into natural language, but their performance in low-data regimes remains limited compared to gradient-boosted decision trees…

Machine Learning · Computer Science 2026-05-12 Yi-Siang Wang, Kuan-Yu Chen, Yu-Chen Den, Darby Tien-Hao Chang

Deselection of Base-Learners for Statistical Boosting -- with an Application to Distributional Regression

We present a new procedure for enhanced variable selection for component-wise gradient boosting. Statistical boosting is a computational approach that emerged from machine learning, which allows to fit regression models in the presence of…

Methodology · Statistics 2022-02-04 Annika Strömer, Christian Staerk, Nadja Klein, Leonie Weinhold, Stephanie Titze, Andreas Mayr

On Subagging Boosted Probit Model Trees

With the insight of variance-bias decomposition, we design a new hybrid bagging-boosting algorithm named SBPMT for classification problems. For the boosting part of SBPMT, we propose a new tree model called Probit Model Tree (PMT) as base…

Machine Learning · Statistics 2023-11-07 Tian Qin, Wei-Min Huang

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Large language models (LLMs) have demonstrated impressive ability in solving complex mathematical problems with multi-step reasoning and can be further enhanced with well-designed in-context learning (ICL) examples. However, this potential…

Computation and Language · Computer Science 2025-02-18 Beichen Zhang, Yuhong Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Haodong Duan, Yuhang Cao, Dahua Lin, Jiaqi Wang

Boosting High Dimensional Predictive Regressions with Time Varying Parameters

High dimensional predictive regressions are useful in a wide range of applications. However, the theory is mainly developed assuming that the model is stationary with time invariant parameters. This is at odds with the prevalent evidence for…

Econometrics · Economics 2019-10-09 Kashif Yousuf, Serena Ng

Controlling false discoveries in high-dimensional situations: Boosting with stability selection

Modern biotechnologies often result in high-dimensional data sets with much more variables than observations (n $\ll$ p). These data sets pose new challenges to statistical analysis: Variable selection becomes one of the most important…

Machine Learning · Statistics 2014-11-06 Benjamin Hofner, Luigi Boccuto, Markus Göker

A One-Covariate-at-a-Time Method for Nonparametric Additive Models

This paper proposes a one-covariate-at-a-time multiple testing (OCMT) approach to choose significant variables in high-dimensional nonparametric additive regression models. Similarly to Chudik, Kapetanios and Pesaran (2018), we consider the…

Econometrics · Economics 2024-05-15 Liangjun Su, Thomas Tao Yang, Yonghui Zhang, Qiankun Zhou

Extending Statistical Boosting - An Overview of Recent Methodological Developments

Boosting algorithms to simultaneously estimate and select predictor effects in statistical models have gained substantial interest during the last decade. This review article aims to highlight recent methodological developments regarding…

Methodology · Statistics 2014-11-19 Andreas Mayr, Harald Binder, Olaf Gefeller, Matthias Schmid

Subset Selection for Multiple Linear Regression via Optimization

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that tradeoff fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park, Diego Klabjan

A Mathematical Programming Approach for Integrated Multiple Linear Regression Subset Selection and Validation

Subset selection for multiple linear regression aims to construct a regression model that minimizes errors by selecting a small number of explanatory variables. Once a model is built, various statistical tests and diagnostics are conducted…

Machine Learning · Statistics 2020-09-04 Seokhyun Chung, Young Woong Park, Taesu Cheong

Gradient Boosting for Linear Mixed Models

Gradient boosting from the field of statistical learning is widely known as a powerful framework for estimation and selection of predictor effects in various regression models by adapting concepts from classification theory. Current…

Methodology · Statistics 2020-11-03 Colin Griesbach, Benjamin Säfken, Elisabeth Waldmann

Boosting Algorithms: Regularization, Prediction and Model Fitting

We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as well as regression models for survival…

Methodology · Statistics 2008-12-18 Peter Bühlmann, Torsten Hothorn

Efficient dynamic model based testing using greedy test case selection

Model-based testing (MBT) provides an automated approach for finding discrepancies between software models and their implementation. If we want to incorporate MBT into the fast and iterative software development process that is Continuous…

Software Engineering · Computer Science 2023-05-02 P. H. M. van Spaendonck

Approximate LTL model checking

The state explosion problem and the exponential computational complexity restrict the further applications of LTL model checking. To this end, this study tries to seek an acceptable approximate solution for LTL model checking by…

Logic in Computer Science · Computer Science 2019-02-19 Weijun Zhu, Jianwei Wang, Yongwen Fan

Boosting for high-dimensional linear models

We prove that boosting with the squared error loss, $L_2$Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as $O$(exp(sample size)), assuming that…

Statistics Theory · Mathematics 2016-08-16 Peter Bühlmann