Related papers: The Theory Behind Overfitting, Cross Validation, R…

Comment: Boosting Algorithms: Regularization, Prediction and Model Fitting

The authors are doing the readers of Statistical Science a true service with a well-written and up-to-date overview of boosting that originated with the seminal algorithms of Freund and Schapire. Equally, we are grateful for high-level…

Methodology · Statistics 2008-12-18 Andreas Buja, David Mease, Abraham J. Wyner

Boosting Algorithms: Regularization, Prediction and Model Fitting

We present a statistical perspective on boosting. Special emphasis is given to estimating potentially complex parametric or nonparametric models, including generalized linear and additive models as well as regression models for survival…

Methodology · Statistics 2008-12-18 Peter Bühlmann, Torsten Hothorn

On the Doubt about Margin Explanation of Boosting

Margin theory provides one of the most popular explanations for the success of AdaBoost, where the central point lies in the recognition that margin is the key for characterizing the performance of AdaBoost. This…

Machine Learning · Computer Science 2013-08-29 Wei Gao, Zhi-Hua Zhou

batchboost: regularization for stabilizing training with resistance to underfitting & overfitting

Overfitting, underfitting, and stable training are important challenges in machine learning. Current approaches for these issues are mixup, SamplePairing and BC learning. In our work, we state the hypothesis that mixing many images…

Machine Learning · Computer Science 2020-01-22 Maciej A. Czyzewski

A new boosting algorithm based on dual averaging scheme

The fields of machine learning and mathematical optimization are increasingly intertwined. The special topic on supervised learning and convex optimization examines this interplay. The training part of most supervised learning algorithms can…

Machine Learning · Computer Science 2015-07-14 Nan Wang

Unbiased Risk Estimation in the Normal Means Problem via Coupled Bootstrap Techniques

We develop a new approach for estimating the risk of an arbitrary estimator of the mean vector in the classical normal means problem. The key idea is to generate two auxiliary data vectors, by adding carefully constructed normal noise…

Statistics Theory · Mathematics 2024-04-25 Natalia L. Oliveira, Jing Lei, Ryan J. Tibshirani

Cross-validation: what does it estimate and how well does it do it?

Cross-validation is a widely-used technique to estimate prediction error, but its behavior is complex and not fully understood. Ideally, one would like to think that cross-validation estimates the prediction error for the model at hand, fit…

Methodology · Statistics 2024-03-12 Stephen Bates, Trevor Hastie, Robert Tibshirani

Multicalibration as Boosting for Regression

We study the connection between multicalibration and boosting for squared error regression. First we prove a useful characterization of multicalibration in terms of a "swap regret"-like condition on squared error. Using this…

Machine Learning · Computer Science 2023-02-01 Ira Globus-Harris, Declan Harrison, Michael Kearns, Aaron Roth, Jessica Sorrell

Margin-Based Generalization Lower Bounds for Boosted Classifiers

Boosting is one of the most successful ideas in machine learning. The most well-accepted explanations for the low generalization error of boosting algorithms such as AdaBoost stem from margin theory. The study of margins in the context of…

Machine Learning · Computer Science 2020-05-08 Allan Grønlund, Lior Kamma, Kasper Green Larsen, Alexander Mathiasen, Jelani Nelson

Boosting with early stopping: Convergence and consistency

Boosting is one of the most significant advances in machine learning for classification and regression. In its original and computationally flexible version, boosting seeks to minimize empirically a loss function in a greedy fashion. The…

Statistics Theory · Mathematics 2007-06-13 Tong Zhang, Bin Yu

Multicalibration Boosting: Theory, Convergence, and Transferability

Multicalibration extends classical calibration by requiring predictions to be unbiased over a rich collection of functions, encompassing both prediction slices and subpopulations. It has emerged as a powerful framework for fairness,…

Machine Learning · Statistics 2026-05-26 Hanxuan Ye, Hongzhe Li

Feature Learning Viewpoint of AdaBoost and a New Algorithm

The AdaBoost algorithm is notably resistant to overfitting. Understanding this phenomenon is a fascinating fundamental theoretical problem. Many studies are devoted to explaining it from a statistical view and…

Machine Learning · Computer Science 2020-07-30 Fei Wang, Zhongheng Li, Fang He, Rong Wang, Weizhong Yu, Feiping Nie

Totally Corrective Boosting for Regularized Risk Minimization

Consideration of the primal and dual problems together leads to important new insights into the characteristics of boosting algorithms. In this work, we propose a general framework that can be used to design new boosting algorithms. A wide…

Artificial Intelligence · Computer Science 2011-12-13 Chunhua Shen, Hanxi Li, Nick Barnes

Stability via resampling: statistical problems beyond the real line

Model averaging techniques based on resampling methods (such as bootstrapping or subsampling) have been utilized across many areas of statistics, often with the explicit goal of promoting stability in the resulting output. We provide a…

Statistics Theory · Mathematics 2024-05-28 Jake A. Soloff, Rina Foygel Barber, Rebecca Willett

Re-scale boosting for regression and classification

Boosting is a learning scheme that combines weak prediction rules to produce a strong composite estimator, with the underlying intuition that one can obtain accurate prediction rules by combining "rough" ones. Although boosting is proved to…

Machine Learning · Computer Science 2015-05-07 Shaobo Lin, Yao Wang, Lin Xu

Benign Overfitting of Constant-Stepsize SGD for Linear Regression

There is an increasing realization that algorithmic inductive biases are central in preventing overfitting; empirically, we often see a benign overfitting phenomenon in overparameterized settings for natural learning algorithms, such as…

Machine Learning · Computer Science 2021-10-14 Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Sham M. Kakade

Optimization by gradient boosting

Gradient boosting is a state-of-the-art prediction technique that sequentially produces a model in the form of linear combinations of simple predictors, typically decision trees, by solving an infinite-dimensional convex optimization…

Statistics Theory · Mathematics 2017-07-18 Gérard Biau, Benoît Cadre

Bootstrapping the Cross-Validation Estimate

Cross-validation is a widely used technique for evaluating the performance of prediction models, ranging from simple binary classification to complex precision medicine strategies. It helps correct for optimism bias in error estimates,…

Methodology · Statistics 2025-09-05 Bryan Cai, Yuanhui Luo, Xinzhou Guo, Fabio Pellegrini, Menglan Pang, Carl de Moor, Changyu Shen, Vivek Charu, Lu Tian

Benign Overfitting in Out-of-Distribution Generalization of Linear Models

Benign overfitting refers to the phenomenon where an over-parameterized model fits the training data perfectly, including noise in the data, but still generalizes well to the unseen test data. While prior work provides some theoretical…

Machine Learning · Computer Science 2024-12-20 Shange Tang, Jiayun Wu, Jianqing Fan, Chi Jin

The Effects of Regularization and Data Augmentation are Class Dependent

Regularization is a fundamental technique to prevent over-fitting and to improve generalization performance by constraining a model's complexity. Current Deep Networks heavily rely on regularizers such as Data-Augmentation (DA) or…

Machine Learning · Computer Science 2022-04-12 Randall Balestriero, Leon Bottou, Yann LeCun