Related papers: Model Selection in High-Dimensional Linear Regress…

200 papers

This paper considers the problem of variable selection allowing for parameter instability. It distinguishes between signal and pseudo-signal variables that are correlated with the target variable, and noise variables that are not, and…

Econometrics · Economics 2024-07-17 Alexander Chudik, M. Hashem Pesaran, Mahrad Sharifvaghefi

Structured additive distributional copula regression allows modeling the joint distribution of multivariate outcomes by relating all distribution parameters to covariates. Estimation via statistical boosting enables accounting for…

Gradient Boosted Decision Trees (GBDTs) are widely used for building ranking and relevance models in search and recommendation. Considerations such as latency and interpretability dictate the use of as few features as possible to train…

Machine Learning · Statistics 2021-09-07 Cuize Han, Nikhil Rao, Daria Sorokina, Karthik Subbian

Boosting methods are widely used in statistical learning to deal with high-dimensional data due to their variable selection feature. However, those methods lack straightforward ways to construct estimators for the precision of the…

Methodology · Statistics 2021-06-10 Boyao Zhang, Colin Griesbach, Cora Kim, Nadia Müller-Voggel, Elisabeth Bergherr

We present a new variable selection method based on model-based gradient boosting and randomly permuted variables. Model-based boosting is a tool to fit a statistical model while performing variable selection at the same time. A drawback of…

Machine Learning · Statistics 2017-02-16 Janek Thomas, Tobias Hepp, Andreas Mayr, Bernd Bischl

Large language models (LLMs) have recently been adapted to tabular prediction by serializing structured features into natural language, but their performance in low-data regimes remains limited compared to gradient-boosted decision trees…

Machine Learning · Computer Science 2026-05-12 Yi-Siang Wang, Kuan-Yu Chen, Yu-Chen Den, Darby Tien-Hao Chang

We present a new procedure for enhanced variable selection for component-wise gradient boosting. Statistical boosting is a computational approach that emerged from machine learning, which allows fitting regression models in the presence of…

With the insight of variance-bias decomposition, we design a new hybrid bagging-boosting algorithm named SBPMT for classification problems. For the boosting part of SBPMT, we propose a new tree model called Probit Model Tree (PMT) as base…

Machine Learning · Statistics 2023-11-07 Tian Qin, Wei-Min Huang

Large language models (LLMs) have demonstrated impressive ability in solving complex mathematical problems with multi-step reasoning and can be further enhanced with well-designed in-context learning (ICL) examples. However, this potential…

Computation and Language · Computer Science 2025-02-18 Beichen Zhang, Yuhong Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Haodong Duan, Yuhang Cao, Dahua Lin, Jiaqi Wang

High-dimensional predictive regressions are useful in a wide range of applications. However, the theory is mainly developed assuming that the model is stationary with time-invariant parameters. This is at odds with the prevalent evidence for…

Econometrics · Economics 2019-10-09 Kashif Yousuf, Serena Ng

Modern biotechnologies often result in high-dimensional data sets with many more variables than observations ($n \ll p$). These data sets pose new challenges to statistical analysis: Variable selection becomes one of the most important…

Machine Learning · Statistics 2014-11-06 Benjamin Hofner, Luigi Boccuto, Markus Göker

This paper proposes a one-covariate-at-a-time multiple testing (OCMT) approach to choose significant variables in high-dimensional nonparametric additive regression models. Similarly to Chudik, Kapetanios and Pesaran (2018), we consider the…

Econometrics · Economics 2024-05-15 Liangjun Su, Thomas Tao Yang, Yonghui Zhang, Qiankun Zhou

Boosting algorithms to simultaneously estimate and select predictor effects in statistical models have gained substantial interest during the last decade. This review article aims to highlight recent methodological developments regarding…

Methodology · Statistics 2014-11-19 Andreas Mayr, Harald Binder, Olaf Gefeller, Matthias Schmid

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that trades off fitting error (explanatory power) against model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park, Diego Klabjan

Subset selection for multiple linear regression aims to construct a regression model that minimizes errors by selecting a small number of explanatory variables. Once a model is built, various statistical tests and diagnostics are conducted…

Machine Learning · Statistics 2020-09-04 Seokhyun Chung, Young Woong Park, Taesu Cheong

Gradient boosting from the field of statistical learning is widely known as a powerful framework for estimation and selection of predictor effects in various regression models by adapting concepts from classification theory. Current…

Methodology · Statistics 2020-11-03 Colin Griesbach, Benjamin Säfken, Elisabeth Waldmann

We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as well as regression models for survival…

Methodology · Statistics 2008-12-18 Peter Bühlmann, Torsten Hothorn

Model-based testing (MBT) provides an automated approach for finding discrepancies between software models and their implementation. If we want to incorporate MBT into the fast and iterative software development process that is Continuous…

Software Engineering · Computer Science 2023-05-02 P. H. M. van Spaendonck

The state explosion problem and exponential computational complexity restrict further applications of LTL model checking. To this end, this study tries to seek an acceptable approximate solution for LTL model checking by…

Logic in Computer Science · Computer Science 2019-02-19 Weijun Zhu, Jianwei Wang, Yongwen Fan

We prove that boosting with the squared error loss, $L_2$Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as $O(\exp(\text{sample size}))$, assuming that…

Statistics Theory · Mathematics 2016-08-16 Peter Bühlmann