Related papers: Instance-dependent Early Stopping

GRADSTOP: Early Stopping of Gradient Descent via Posterior Sampling

Machine learning models are often learned by minimising a loss function on the training data using a gradient descent algorithm. These models often suffer from overfitting, leading to a decline in predictive performance on unseen data. A…

Machine Learning · Computer Science 2026-01-28 Arash Jamshidi, Lauri Seppäläinen, Katsiaryna Haitsiukevich, Hoang Phuc Hau Luu, Anton Björklund, Kai Puolamäki

Early Stopping without a Validation Set

Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split…

Machine Learning · Computer Science 2017-06-07 Maren Mahsereci, Lukas Balles, Christoph Lassner, Philipp Hennig

Instance-Dependent Confidence and Early Stopping for Reinforcement Learning

Various algorithms for reinforcement learning (RL) exhibit dramatic variation in their convergence rates as a function of problem structure. Such problem-dependent behavior is not captured by worst-case analyses and has accordingly inspired…

Machine Learning · Statistics 2022-01-24 Koulik Khamaru, Eric Xia, Martin J. Wainwright, Michael I. Jordan

ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference

Early Exiting is one of the most popular methods to achieve efficient inference. Current early exiting methods adopt the (weighted) sum of the cross entropy loss of all internal classifiers during training, imposing all these classifiers to…

Computation and Language · Computer Science 2024-04-09 Ziqian Zeng, Yihuai Hong, Hongliang Dai, Huiping Zhuang, Cen Chen

GradES: Significantly Faster Training in Transformers with Gradient-Based Early Stopping

Early stopping monitors global validation loss and halts all parameter updates simultaneously, which is computationally costly for large transformers due to the extended time required for validation inference. We propose GradES, a…

Machine Learning · Computer Science 2025-10-20 Qifu Wen, Xi Zeng, Zihan Zhou, Shuaijun Liu, Mehdi Hosseinzadeh, Ningxin Su, Reza Rawassizadeh

Balancing Stability and Plasticity in Sequentially Trained Early-Exiting Neural Networks

Early-exiting neural networks enable adaptive inference by allowing inputs to exit at intermediate classifiers, reducing computation for easy samples while maintaining high accuracy. In practice, exits can be trained sequentially by…

Machine Learning · Computer Science 2026-05-08 Alaa Zniber, Ouassim Karrakchou, Mounir Ghogho

Conformal inference is (almost) free for neural networks trained with early stopping

Early stopping based on hold-out data is a popular regularization technique designed to mitigate overfitting and increase the predictive accuracy of neural networks. Models trained with early stopping often provide relatively accurate…

Machine Learning · Statistics 2023-06-28 Ziyi Liang, Yanfei Zhou, Matteo Sesia

Early stopping by correlating online indicators in neural networks

In order to minimize the generalization error in neural networks, a novel technique to identify overfitting phenomena when training the learner is formally introduced. This enables support of a reliable and trustworthy early stopping…

Machine Learning · Computer Science 2024-02-06 Manuel Vilares Ferro, Yerai Doval Mosquera, Francisco J. Ribadas Pena, Victor M. Darriba Bilbao

A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation

Early exiting allows instances to exit at different layers according to the estimation of difficulty. Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffers from…

Computation and Language · Computer Science 2022-03-04 Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu

On Optimal Early Stopping: Over-informative versus Under-informative Parametrization

Early stopping is a simple and widely used method to prevent over-training neural networks. We develop theoretical results to reveal the relationship between the optimal early stopping time and model dimension as well as sample size of the…

Machine Learning · Computer Science 2022-02-25 Ruoqi Shen, Liyao Gao, Yi-An Ma

Being Patient and Persistent: Optimizing An Early Stopping Strategy for Deep Learning in Profiled Attacks

The absence of an algorithm that effectively monitors deep learning models used in side-channel attacks increases the difficulty of evaluation. If the attack is unsuccessful, the question is if we are dealing with a resistant implementation…

Cryptography and Security · Computer Science 2021-11-30 Servio Paguada, Lejla Batina, Ileana Buhan, Igor Armendariz

RAEE: A Robust Retrieval-Augmented Early Exit Framework for Efficient Inference

Deploying large language model inference remains challenging due to their high computational overhead. Early exit optimizes model inference by adaptively reducing the number of inference layers. Current methods typically train internal…

Computation and Language · Computer Science 2026-03-05 Lianming Huang, Shangyu Wu, Yufei Cui, Ying Xiong, Haibo Hu, Xue Liu, Tei-Wei Kuo, Nan Guan, Chun Jason Xue

Early Stopping for Deep Image Prior

Deep image prior (DIP) and its variants have shown remarkable potential for solving inverse problems in computer vision, without any extra training data. Practical DIP models are often substantially overparameterized. During the fitting…

Computer Vision and Pattern Recognition · Computer Science 2023-12-13 Hengkang Wang, Taihui Li, Zhong Zhuang, Tiancong Chen, Hengyue Liang, Ju Sun

Energy-Efficient Adaptive Machine Learning on IoT End-Nodes With Class-Dependent Confidence

Energy-efficient machine learning models that can run directly on edge devices are of great interest in IoT applications, as they can reduce network pressure and response latency, and improve privacy. An effective way to obtain…

Machine Learning · Computer Science 2022-04-08 Francesco Daghero, Alessio Burrello, Daniele Jahier Pagliari, Luca Benini, Enrico Macii, Massimo Poncino

Instance Weighted Incremental Evolution Strategies for Reinforcement Learning in Dynamic Environments

Evolution strategies (ES), a family of black-box optimization algorithms, have recently emerged as a scalable alternative to reinforcement learning (RL) approaches such as Q-learning or policy gradient, and are much faster when many central…

Machine Learning · Computer Science 2022-04-01 Zhi Wang, Chunlin Chen, Daoyi Dong

Early-stopped aggregation: Adaptive inference with computational efficiency

When considering a model selection or, more generally, an aggregation approach for adaptive statistical inference, it is often necessary to compute estimators over a wide range of model complexities including unnecessarily large models even…

Statistics Theory · Mathematics 2026-04-17 Ilsang Ohn, Shitao Fan, Jungbin Jun, Lizhen Lin

Independence-Encouraging Subsampling for Nonparametric Additive Models

The additive model is a popular nonparametric regression method due to its ability to retain modeling flexibility while avoiding the curse of dimensionality. The backfitting algorithm is an intuitive and widely used numerical approach for…

Methodology · Statistics 2023-02-28 Yi Zhang, Lin Wang, Xiaoke Zhang, HaiYing Wang

ACE: Adaptive Constraint-aware Early Stopping in Hyperparameter Optimization

Deploying machine learning models requires high model quality and needs to comply with application constraints. That motivates hyperparameter optimization (HPO) to tune model configurations under deployment constraints. The constraints…

Machine Learning · Computer Science 2022-08-08 Yi-Wei Chen, Chi Wang, Amin Saied, Rui Zhuang

NYTRO: When Subsampling Meets Early Stopping

Early stopping is a well known approach to reduce the time complexity for performing training and model selection of large scale learning machines. On the other hand, memory/space (rather than time) complexity is the main constraint in many…

Machine Learning · Statistics 2018-02-02 Tomas Angles, Raffaello Camoriano, Alessandro Rudi, Lorenzo Rosasco

Performance Control in Early Exiting to Deploy Large Models at the Same Cost of Smaller Ones

Early Exiting (EE) is a promising technique for speeding up inference by adaptively allocating compute resources to data points based on their difficulty. The approach enables predictions to exit at earlier layers for simpler samples while…

Machine Learning · Computer Science 2024-12-30 Mehrnaz Mofakhami, Reza Bayat, Ioannis Mitliagkas, Joao Monteiro, Valentina Zantedeschi