Related papers: Analyzing the discrepancy principle for kernelized…

The Discrepancy Principle for Choosing Bandwidths in Kernel Density Estimation

We investigate the discrepancy principle for choosing smoothing parameters for kernel density estimation. The method is based on the distance between the empirical and estimated distribution functions. We prove some new positive and…

Statistics Theory · Mathematics 2015-03-19 Thoralf Mildenberger

Early stopping and polynomial smoothing in regression with reproducing kernels

In this paper, we study the problem of early stopping for iterative learning algorithms in a reproducing kernel Hilbert space (RKHS) in the nonparametric regression framework. In particular, we work with the gradient descent and (iterative)…

Machine Learning · Statistics 2024-11-26 Yaroslav Averyanov, Alain Celisse

Adaptive Stopping Rule for Kernel-based Gradient Descent Algorithms

In this paper, we propose an adaptive stopping rule for kernel-based gradient descent (KGD) algorithms. We introduce the empirical effective dimension to quantify the increments of iterations in KGD and derive an implementable early…

Machine Learning · Computer Science 2023-06-14 Xiangyu Chang, Shao-Bo Lin

Early Stopping of Untrained Convolutional Neural Networks

In recent years, new regularization methods based on (deep) neural networks have shown very promising empirical performance for the numerical solution of ill-posed problems, e.g., in medical imaging and imaging science. Due to the…

Numerical Analysis · Mathematics 2024-06-07 Tim Jahn, Bangti Jin

Early stopping and non-parametric regression: An optimal data-dependent stopping rule

The strategy of early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm. Focusing on non-parametric regression in a reproducing kernel Hilbert space, we analyze the early stopping strategy…

Machine Learning · Statistics 2013-06-18 Garvesh Raskutti, Martin J. Wainwright, Bin Yu

On the Discrepancy Principle for Stochastic Gradient Descent

Stochastic gradient descent (SGD) is a promising numerical method for solving large-scale inverse problems. However, its theoretical properties remain largely underexplored through the lens of classical regularization theory. In this note, we…

Numerical Analysis · Mathematics 2020-07-22 Tim Jahn, Bangti Jin

An Uncertainty Principle for Linear Recurrent Neural Networks

We consider linear recurrent neural networks, which have become a key building block of sequence modeling due to their ability for stable and effective long-range modeling. In this paper, we aim at characterizing this ability on a simple…

Machine Learning · Computer Science 2025-02-14 Alexandre François, Antonio Orvieto, Francis Bach

Early Stopping for Nonparametric Testing

Early stopping of iterative algorithms is an algorithmic regularization method to avoid over-fitting in estimation and classification. In this paper, we show that early stopping can also be applied to obtain the minimax optimal testing in a…

Statistics Theory · Mathematics 2018-09-18 Meimei Liu, Guang Cheng

On the discrepancy principle

A simple proof of the convergence of variational regularization, with the regularization parameter chosen by the discrepancy principle, is given for linear operators under suitable assumptions. It is shown that the discrepancy…

Mathematical Physics · Physics 2007-05-23 A. G. Ramm

Simultaneous Model Selection and Optimization through Parameter-free Stochastic Learning

Stochastic gradient descent algorithms for training linear and kernel predictors are gaining more and more importance, thanks to their scalability. While various methods have been proposed to speed up their convergence, the model selection…

Machine Learning · Computer Science 2014-06-17 Francesco Orabona

Error bounds for numerical differentiation using kernels of finite smoothness

We provide improved error bounds for kernel-based numerical differentiation in terms of growth functions when kernels are of a finite smoothness, such as polyharmonic splines, thin plate splines or Wendland kernels. In contrast to existing…

Numerical Analysis · Mathematics 2025-12-24 Oleg Davydov

Learning to Stop While Learning to Predict

There is a recent surge of interest in designing deep architectures based on the update steps in traditional algorithms, or learning neural networks to improve and replace traditional algorithms. While traditional algorithms have certain…

Machine Learning · Computer Science 2020-06-11 Xinshi Chen, Hanjun Dai, Yu Li, Xin Gao, Le Song

Gradient Descent Finds Over-Parameterized Neural Networks with Sharp Generalization for Nonparametric Regression

We study nonparametric regression by an over-parameterized two-layer neural network trained by gradient descent (GD) in this paper. We show that, if the neural network is trained by GD with early stopping, then the trained network renders a…

Machine Learning · Statistics 2025-11-07 Yingzhen Yang, Ping Li

Adaptive Parameter Selection for Kernel Ridge Regression

This paper focuses on parameter selection issues of kernel ridge regression (KRR). Due to special spectral properties of KRR, we find that delicate subdivision of the parameter interval shrinks the difference between two successive KRR…

Machine Learning · Computer Science 2023-12-12 Shao-Bo Lin

Learning as filtering: implications for spike-based plasticity

Most normative models in computational neuroscience describe the task of learning as the optimisation of a cost function with respect to a set of parameters. However, learning as optimisation fails to account for a time varying environment…

Neurons and Cognition · Quantitative Biology 2020-08-10 Jannes Jegminat, Jean-Pascal Pfister

Learning with Exact Invariances in Polynomial Time

We study the statistical-computational trade-offs for learning with exact invariances (or symmetries) using kernel regression. Traditional methods, such as data augmentation, group averaging, canonicalization, and frame-averaging, either…

Machine Learning · Computer Science 2026-02-05 Ashkan Soleymani, Behrooz Tahmasebi, Stefanie Jegelka, Patrick Jaillet

Sparse Spectrum Warped Input Measures for Nonstationary Kernel Learning

We establish a general form of explicit, input-dependent, measure-valued warpings for learning nonstationary kernels. While stationary kernels are ubiquitous and simple to use, they struggle to adapt to functions that vary in smoothness…

Machine Learning · Computer Science 2020-10-12 Anthony Tompkins, Rafael Oliveira, Fabio Ramos

A Probabilistic Oracle Inequality and Quantification of Uncertainty of a modified Discrepancy Principle for Statistical Inverse Problems

In this note we consider spectral cut-off estimators to solve a statistical linear inverse problem under arbitrary white noise. The truncation level is determined with a recently introduced adaptive method based on the classical discrepancy…

Numerical Analysis · Mathematics 2022-02-28 Tim Jahn

Learning Curves and Benign Overfitting of Spectral Algorithms in Large Dimensions

Existing large-dimensional theory for spectral algorithms resolves either the optimally tuned point or the interpolation limit, but leaves the under-regularized regime unexplored. We study the learning curve and benign overfitting of…

Machine Learning · Statistics 2026-04-28 Weihao Lu, Qian Lin, Yingcun Xia, Dongming Huang

Spectral Algorithms in Misspecified Regression: Convergence under Covariate Shift

This paper investigates the convergence properties of spectral algorithms -- a class of regularization methods originating from inverse problems -- under covariate shift. In this setting, the marginal distributions of inputs differ between…

Machine Learning · Statistics 2025-09-08 Ren-Rui Liu, Zheng-Chu Guo