Related papers: Meta-Learning Mini-Batch Risk Functionals

Revisiting Small Batch Training for Deep Neural Networks

Modern deep neural network training is typically based on mini-batch stochastic gradient optimization. While the use of large mini-batches increases the available computational parallelism, small batch training has been shown to provide…

Machine Learning · Computer Science 2018-04-23 Dominic Masters, Carlo Luschi

Statistical Learning with Conditional Value at Risk

We propose a risk-averse statistical learning framework wherein the performance of a learning algorithm is evaluated by the conditional value-at-risk (CVaR) of losses rather than the expected loss. We devise algorithms based on stochastic…

Machine Learning · Computer Science 2020-02-17 Tasuku Soma, Yuichi Yoshida

Efficient Large-Scale Learning of Minimax Risk Classifiers

Supervised learning with large-scale data usually leads to complex optimization problems, especially for classification tasks with multiple classes. Stochastic subgradient methods can enable efficient learning with a large number of samples…

Machine Learning · Computer Science 2025-11-25 Kartheek Bondugula, Santiago Mazuelas, Aritz Pérez

Training trajectories, mini-batch losses and the curious role of the learning rate

Stochastic gradient descent plays a fundamental role in nearly all applications of deep learning. However its ability to converge to a global minimum remains shrouded in mystery. In this paper we propose to study the behavior of the loss…

Machine Learning · Computer Science 2023-02-02 Mark Sandler, Andrey Zhmoginov, Max Vladymyrov, Nolan Miller

MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning

Equipping a deep model with the ability of few-shot learning, i.e., learning quickly from only a few examples, is a core challenge for artificial intelligence. Gradient-based meta-learning approaches effectively address the challenge by learning…

Machine Learning · Computer Science 2024-01-09 Baoquan Zhang, Chuyao Luo, Demin Yu, Huiwei Lin, Xutao Li, Yunming Ye, Bowen Zhang

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

In this paper, we develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches is becoming a golden standard in the…

Optimization and Control · Mathematics 2017-09-20 Tomoya Murata, Taiji Suzuki

Variable-Shot Adaptation for Online Meta-Learning

Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks. However, in many real world settings, it is more natural to…

Machine Learning · Computer Science 2020-12-15 Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine

Distributionally robust minimization in meta-learning for system identification

Meta learning aims at learning how to solve tasks, and thus it allows to estimate models that can be quickly adapted to new scenarios. This work explores distributionally robust minimization in meta learning for system identification.…

Machine Learning · Computer Science 2025-06-24 Matteo Rufolo, Dario Piga, Marco Forgione

MetaModulation: Learning Variational Feature Hierarchies for Few-Shot Learning with Fewer Tasks

Meta-learning algorithms are able to learn a new task using previously learned knowledge, but they often require a large number of meta-training tasks which may not be readily available. To address this issue, we propose a method for…

Machine Learning · Computer Science 2023-05-18 Wenfang Sun, Yingjun Du, Xiantong Zhen, Fan Wang, Ling Wang, Cees G. M. Snoek

Incremental Meta-Learning via Indirect Discriminant Alignment

Majority of the modern meta-learning methods for few-shot classification tasks operate in two phases: a meta-training phase where the meta-learner learns a generic representation by solving multiple few-shot tasks sampled from a large…

Machine Learning · Computer Science 2020-04-23 Qing Liu, Orchid Majumder, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto

Submodular Batch Selection for Training Deep Neural Networks

Mini-batch gradient descent based methods are the de facto algorithms for training neural network architectures today. We introduce a mini-batch selection strategy based on submodular function maximization. Our novel submodular formulation…

Machine Learning · Computer Science 2019-06-21 K J Joseph, Vamshi Teja R, Krishnakant Singh, Vineeth N Balasubramanian

Coupling Adaptive Batch Sizes with Learning Rates

Mini-batch stochastic gradient descent and variants thereof have become standard for large-scale empirical risk minimization like the training of neural networks. These methods are usually used with a constant batch size chosen by simple…

Machine Learning · Computer Science 2017-06-29 Lukas Balles, Javier Romero, Philipp Hennig

Meta-Learning via Learned Loss

Typically, loss functions, regularization mechanisms and other important aspects of training parametric models are chosen heuristically from a limited set of options. In this paper, we take the first step towards automating this process,…

Machine Learning · Computer Science 2021-01-20 Sarah Bechtle, Artem Molchanov, Yevgen Chebotar, Edward Grefenstette, Ludovic Righetti, Gaurav Sukhatme, Franziska Meier

Meta-active Learning in Probabilistically-Safe Optimization

Learning to control a safety-critical system with latent dynamics (e.g. for deep brain stimulation) requires taking calculated risks to gain information as efficiently as possible. To address this problem, we present a…

Machine Learning · Computer Science 2020-07-09 Mariah L. Schrum, Mark Connolly, Eric Cole, Mihir Ghetiya, Robert Gross, Matthew C. Gombolay

Meta-Learning with Generalized Ridge Regression: High-dimensional Asymptotics, Optimality and Hyper-covariance Estimation

Meta-learning involves training models on a variety of training tasks in a way that enables them to generalize well on new, unseen test tasks. In this work, we consider meta-learning within the framework of high-dimensional multivariate…

Statistics Theory · Mathematics 2024-04-01 Yanhao Jin, Krishnakumar Balasubramanian, Debashis Paul

Gradient Agreement as an Optimization Objective for Meta-Learning

This paper presents a novel optimization method for maximizing generalization over tasks in meta-learning. The goal of meta-learning is to learn a model for an agent adapting rapidly when presented with previously unseen tasks. Tasks are…

Machine Learning · Computer Science 2018-10-19 Amir Erfan Eshratifar, David Eigen, Massoud Pedram

Robust Risk Minimization for Statistical Learning

We consider a general statistical learning problem where an unknown fraction of the training data is corrupted. We develop a robust learning method that only requires specifying an upper bound on the corrupted data fraction. The method…

Machine Learning · Statistics 2020-02-10 Muhammad Osama, Dave Zachariah, Peter Stoica

Learning an Explicit Hyperparameter Prediction Function Conditioned on Tasks

Meta learning has attracted much attention recently in machine learning community. Contrary to conventional machine learning aiming to learn inherent prediction rules to predict labels for new query data, meta learning aims to learn the…

Machine Learning · Computer Science 2023-07-04 Jun Shu, Deyu Meng, Zongben Xu

Risk-Averse Learning with Varying Risk Levels

In safety-critical decision-making, the environment may evolve over time, and the learner adjusts its risk level accordingly. This work investigates risk-averse online optimization in dynamic environments with varying risk levels, employing…

Optimization and Control · Mathematics 2025-12-30 Siyi Wang, Zifan Wang, Karl H. Johansson

Risk-aware linear bandits with convex loss

In decision-making problems such as the multi-armed bandit, an agent learns sequentially by optimizing a certain feedback. While the mean reward criterion has been extensively studied, other measures that reflect an aversion to adverse…

Machine Learning · Statistics 2023-03-28 Patrick Saux, Odalric-Ambrym Maillard