Related papers: Provably Data-driven Multiple Hyper-parameter Tuni…

How much data is sufficient to learn high-performing algorithms? Generalization guarantees for data-driven algorithm design

Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made…

Machine Learning · Computer Science 2021-04-27 Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, Ellen Vitercik

A study on tuning parameter selection for the high-dimensional lasso

High-dimensional predictive models, those with more measurements than observations, require regularization to be well defined, perform well empirically, and possess theoretical guarantees. The amount of regularization, often determined by…

Methodology · Statistics 2019-07-16 Darren Homrighausen, Daniel J. McDonald

Doubly Robust Semiparametric Inference Using Regularized Calibrated Estimation with High-dimensional Data

Consider semiparametric estimation where a doubly robust estimating function for a low-dimensional parameter is available, depending on two working models. With high-dimensional data, we develop regularized calibrated estimation as a…

Methodology · Statistics 2020-09-28 Satyajit Ghosh, Zhiqiang Tan

Distribution-dependent Generalization Bounds for Tuning Linear Regression Across Tasks

Modern regression problems often involve high-dimensional data and a careful tuning of the regularization hyperparameters is crucial to avoid overly complex models that may overfit the training data while guaranteeing desirable properties…

Machine Learning · Computer Science 2026-04-08 Maria-Florina Balcan, Saumya Goyal, Dravyansh Sharma

Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates

We study learning to learn for regression problems through the lens of hyperparameter tuning. We propose the Langevin Gradient Descent Algorithm (LGD), which approximates the mean of the posterior distribution defined by the loss function…

Machine Learning · Computer Science 2026-04-16 Saumya Goyal, Rohith Rongali, Ritabrata Ray, Barnabás Póczos

Data-Driven Performance Guarantees for Classical and Learned Optimizers

We introduce a data-driven approach to analyze the performance of continuous optimization algorithms using generalization guarantees from statistical learning theory. We study classical and learned optimizers to solve families of parametric…

Optimization and Control · Mathematics 2025-10-07 Rajiv Sambharya, Bartolomeo Stellato

Learning Weighted Representations for Generalization Across Designs

Predictive models that generalize well under distributional shift are often desirable and sometimes crucial to building robust and reliable machine learning applications. We focus on distributional shift that arises in causal inference from…

Machine Learning · Statistics 2018-02-27 Fredrik D. Johansson, Nathan Kallus, Uri Shalit, David Sontag

Data-Driven Performance Guarantees for Parametric Optimization Problems

We propose a data-driven method to establish probabilistic performance guarantees for parametric optimization problems solved via iterative algorithms. Our approach addresses two key challenges: providing convergence guarantees to…

Optimization and Control · Mathematics 2025-10-31 Jingyi Huang, Paul Goulart, Kostas Margellos

A Survey of Tuning Parameter Selection for High-dimensional Regression

Penalized (or regularized) regression, as represented by Lasso and its variants, has become a standard technique for analyzing high-dimensional data when the number of variables substantially exceeds the sample size. The performance of…

Methodology · Statistics 2019-08-13 Yunan Wu, Lan Wang

High-dimensional regression with noisy and missing data: Provable guarantees with nonconvexity

Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependence, as well. We study these…

Statistics Theory · Mathematics 2015-03-19 Po-Ling Loh, Martin J. Wainwright

Provably tuning the ElasticNet across instances

An important unresolved challenge in the theory of regularization is to set the regularization coefficients of popular techniques like the ElasticNet with general provable guarantees. We consider the problem of tuning the regularization…

Machine Learning · Computer Science 2024-01-17 Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet Talwalkar

A Likelihood Ratio Framework for High Dimensional Semiparametric Regression

We propose a likelihood ratio based inferential framework for high dimensional semiparametric generalized linear models. This framework addresses a variety of challenging problems in high dimensional data analysis, including incomplete…

Machine Learning · Statistics 2015-11-24 Yang Ning, Tianqi Zhao, Han Liu

Guarantees for data-driven control of nonlinear systems using semidefinite programming: A survey

This survey presents recent research on determining control-theoretic properties and designing controllers with rigorous guarantees using semidefinite programming and for nonlinear systems for which no mathematical models but measured…

Optimization and Control · Mathematics 2023-11-06 Tim Martin, Thomas B. Schön, Frank Allgöwer

Generalization Bounds for Data-Driven Numerical Linear Algebra

Data-driven algorithms can adapt their internal structure or parameters to inputs from unknown application-specific distributions, by learning from a training sample of inputs. Several recent works have applied this approach to problems in…

Machine Learning · Computer Science 2022-06-17 Peter Bartlett, Piotr Indyk, Tal Wagner

Algorithm Configuration for Structured Pfaffian Settings

Data-driven algorithm design automatically adapts algorithms to specific application domains, achieving better performance. In the context of parameterized algorithms, this approach involves tuning the algorithm's hyperparameters using…

Machine Learning · Computer Science 2025-05-23 Maria-Florina Balcan, Anh Tuan Nguyen, Dravyansh Sharma

Stronger Generalization Guarantees for Robot Learning by Combining Generative Models and Real-World Data

We are motivated by the problem of learning policies for robotic systems with rich sensory inputs (e.g., vision) in a manner that allows us to guarantee generalization to environments unseen during training. We provide a framework for…

Robotics · Computer Science 2022-07-25 Abhinav Agarwal, Sushant Veer, Allen Z. Ren, Anirudha Majumdar

Demystifying Data-Driven Probabilistic Medium-Range Weather Forecasting

The recent revolution in data-driven methods for weather forecasting has led to a fragmented landscape of complex, bespoke architectures and training strategies, obscuring the fundamental drivers of forecast accuracy. Here, we demonstrate…

Machine Learning · Computer Science 2026-01-27 Jean Kossaifi, Nikola Kovachki, Morteza Mardani, Daniel Leibovici, Suman Ravuri, Ira Shokar, Edoardo Calvello, Mohammad Shoaib Abbas, Peter Harrington, Ashay Subramaniam, Noah Brenowitz, Boris Bonev, Wonmin Byeon, Karsten Kreis, Dale Durran, Arash Vahdat, Mike Pritchard, Jan Kautz

A physics-aware, probabilistic machine learning framework for coarse-graining high-dimensional systems in the Small Data regime

The automated construction of coarse-grained models represents a pivotal component in computer simulation of physical systems and is a key enabler in various analysis and design tasks related to uncertainty quantification. Pertinent methods…

Machine Learning · Statistics 2019-09-11 Constantin Grigo, Phaedon-Stelios Koutsourelakis

Guided Hyperparameter Tuning Through Visualization and Inference

For deep learning practitioners, hyperparameter tuning for optimizing model performance can be a computationally expensive task. Though visualization can help practitioners relate hyperparameter settings to overall model performance,…

Human-Computer Interaction · Computer Science 2021-05-26 Hyekang Joo, Calvin Bao, Ishan Sen, Furong Huang, Leilani Battle

On the interplay between data structure and loss function in classification problems

One of the central puzzles in modern machine learning is the ability of heavily overparametrized models to generalize well. Although the low-dimensional structure of typical datasets is key to this behavior, most theoretical studies of…

Machine Learning · Computer Science 2021-10-13 Stéphane d'Ascoli, Marylou Gabrié, Levent Sagun, Giulio Biroli