Related papers: Adaptive Data Optimization: Dynamic Sample Selecti…

Efficient Adaptive Data Analysis over Dense Distributions

Modern data workflows are inherently adaptive, repeatedly querying the same dataset to refine and validate sequential decisions, but such adaptivity can lead to overfitting and invalid statistical inference. Adaptive Data Analysis (ADA)…

Machine Learning · Computer Science 2026-02-10 Joon Suk Huh

Knowing what to know: Implications of the choice of prior distribution on the behavior of adaptive design optimization

Adaptive design optimization (ADO) is a state-of-the-art technique for experimental design (Cavagnaro, Myung, Pitt, & Kujala, 2010). ADO dynamically identifies stimuli that, in expectation, yield the most information about a hypothetical…

Applications · Statistics 2024-07-10 Sabina J. Sloman, Daniel Cavagnaro, Stephen B. Broomell

Learning Distributionally Robust Models at Scale via Composite Optimization

To train machine learning models that are robust to distribution shifts in the data, distributionally robust optimization (DRO) has been proven very effective. However, the existing approaches to learning a distributionally robust model…

Machine Learning · Computer Science 2022-03-21 Farzin Haddadpour, Mohammad Mahdi Kamani, Mehrdad Mahdavi, Amin Karbasi

Data-driven Distributionally Robust Optimization over Time

Stochastic Optimization (SO) is a classical approach for optimization under uncertainty that typically requires knowledge about the probability distribution of uncertain parameters. As the latter is often unknown, Distributionally Robust…

Optimization and Control · Mathematics 2023-04-12 Kevin-Martin Aigner, Andreas Bärmann, Kristin Braun, Frauke Liers, Sebastian Pokutta, Oskar Schneider, Kartikey Sharma, Sebastian Tschuppik

Using Scaling Laws for Data Source Utility Estimation in Domain-Specific Pre-Training

We introduce a framework for optimizing domain-specific dataset construction in foundation model training. Specifically, we seek a cost-efficient way to estimate the quality of data sources (e.g. synthetically generated or filtered web…

Machine Learning · Computer Science 2025-07-31 Oleksiy Ostapenko, Charles Guille-Escuret, Luke Kumar, Max Tian, Denis Kocetkov, Gopeshh Subbaraj, Raymond Li, Joel Lamy-Poirier, Sebastien Paquet, Torsten Scholak

Aligning Distributionally Robust Optimization with Practical Deep Learning Needs

While traditional Deep Learning (DL) optimization methods treat all training samples equally, Distributionally Robust Optimization (DRO) adaptively assigns importance weights to different samples. However, a significant gap exists between…

Machine Learning · Computer Science 2025-09-26 Dmitrii Feoktistov, Igor Ignashin, Andrey Veprikov, Nikita Borovko, Alexander Bogdanov, Savelii Chezhegov, Aleksandr Beznosikov

Amortized Proximal Optimization

We propose a framework for online meta-optimization of parameters that govern optimization, called Amortized Proximal Optimization (APO). We first interpret various existing neural network optimizers as approximate stochastic proximal point…

Machine Learning · Computer Science 2022-03-02 Juhan Bae, Paul Vicol, Jeff Z. HaoChen, Roger Grosse

Scaling Laws for Optimal Data Mixtures

Large foundation models are typically trained on data from multiple domains, with the data mixture (the proportion of each domain used) playing a critical role in model performance. The standard approach to selecting this mixture relies on…

Machine Learning · Computer Science 2025-10-03 Mustafa Shukor, Louis Bethune, Dan Busbridge, David Grangier, Enrico Fini, Alaaeldin El-Nouby, Pierre Ablin

Posterior Distribution-assisted Evolutionary Dynamic Optimization as an Online Calibrator for Complex Social Simulations

The calibration of simulators for complex social systems aims to identify the optimal parameter that drives the output of the simulator best matching the target data observed from the system. As many social systems may change internally…

Neural and Evolutionary Computing · Computer Science 2026-01-28 Peng Yang, Zhenhua Yang, Boquan Jiang, Chenkai Wang, Ke Tang, Xin Yao

Data assimilation and online optimization with performance guarantees

This paper considers a class of real-time stochastic optimization problems dependent on an unknown probability distribution. In the considered scenario, data is streaming frequently while trying to reach a decision. Thus, we aim to devise a…

Optimization and Control · Mathematics 2020-09-08 Dan Li, Sonia Martinez

A Unified Approach to Adaptive Regularization in Online and Stochastic Optimization

We describe a framework for deriving and analyzing online optimization algorithms that incorporate adaptive, data-dependent regularization, also termed preconditioning. Such algorithms have been proven useful in stochastic optimization by…

Machine Learning · Computer Science 2017-06-21 Vineet Gupta, Tomer Koren, Yoram Singer

An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives

In this paper, we propose a practical online method for solving a class of distributionally robust optimization (DRO) with non-convex objectives, which has important applications in machine learning for improving the robustness of neural…

Machine Learning · Computer Science 2021-11-15 Qi Qi, Zhishuai Guo, Yi Xu, Rong Jin, Tianbao Yang

Adaptive Composite Online Optimization: Predictions in Static and Dynamic Environments

In the past few years, Online Convex Optimization (OCO) has received notable attention in the control literature thanks to its flexible real-time nature and powerful performance guarantees. In this paper, we propose new step-size rules and…

Optimization and Control · Mathematics 2023-01-18 Pedro Zattoni Scroccaro, Arman Sharifi Kolarijani, Peyman Mohajerin Esfahani

Navigating Scaling Laws: Compute Optimality in Adaptive Model Training

In recent years, the state-of-the-art in deep learning has been dominated by very large models that have been pre-trained on vast amounts of data. The paradigm is very simple: investing more computational resources (optimally) leads to…

Machine Learning · Computer Science 2024-05-24 Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag, Thomas Hofmann

ADORA: Training Reasoning Models with Dynamic Advantage Estimation on Reinforcement Learning

Reinforcement learning has become a cornerstone technique for developing reasoning models in complex tasks, ranging from mathematical problem-solving to imaginary reasoning. The optimization of these models typically relies on policy…

Machine Learning · Computer Science 2026-02-11 Qingnan Ren, Shiting Huang, Zhen Fang, Zehui Chen, Lin Chen, Lijun Li, Feng Zhao

Adaptive Training Distributions with Scalable Online Bilevel Optimization

Large neural networks pretrained on web-scale corpora are central to modern machine learning. In this paradigm, the distribution of the large, heterogeneous pretraining data rarely matches that of the application domain. This work considers…

Machine Learning · Computer Science 2023-11-21 David Grangier, Pierre Ablin, Awni Hannun

Adapt Data to Model: Adaptive Transformation Optimization for Domain-shared Time Series Foundation Models

Large time series models (LTMs) have emerged as powerful tools for universal forecasting, yet they often struggle with the inherent diversity and nonstationarity of real-world time series data, leading to an unsatisfactory trade-off between…

Machine Learning · Computer Science 2026-03-03 Yunzhong Qiu, Zhiyao Cen, Zhongyi Pei, Chen Wang, Jianmin Wang

Adaptive Decision-Objective Loss for Forecast-then-Optimize in Power Systems

Forecast-then-optimize is a widely-used framework for decision-making problems in power systems. Traditionally, statistical losses have been employed to train forecasting models, but recent research demonstrated that improved decision…

Systems and Control · Electrical Eng. & Systems 2023-12-22 Haipeng Zhang, Ran Li, Mingyang Sun, Teng Fei

Adaptive Data Dropout: Towards Self-Regulated Learning in Deep Neural Networks

Deep neural networks are typically trained by uniformly sampling large datasets across epochs, despite evidence that not all samples contribute equally throughout learning. Recent work shows that progressively reducing the amount of…

Machine Learning · Computer Science 2026-04-15 Amar Gahir, Varshil Patel, Shreyank N Gowda

Algorithmic Bias and Data Bias: Understanding the Relation between Distributionally Robust Optimization and Data Curation

Machine learning systems based on minimizing average error have been shown to perform inconsistently across notable subsets of the data, which is not exposed by a low average error for the entire dataset. In consequential social and…

Machine Learning · Computer Science 2021-06-18 Agnieszka Słowik, Léon Bottou