Related papers: EvoGrad: Efficient Gradient-Based Meta-Learning an…

EvoGrad: Metaheuristics in a Differentiable Wonderland

Differentiable programming has revolutionised optimisation by enabling efficient gradient-based training of complex models, such as Deep Neural Networks (NNs) with billions and trillions of parameters. However, traditional Evolutionary…

Neural and Evolutionary Computing · Computer Science 2025-06-10 Beatrice F. R. Citterio, Andrea Tangherloni

Scalable Meta-Learning via Mixed-Mode Differentiation

Gradient-based bilevel optimisation is a powerful technique with applications in hyperparameter optimisation, task adaptation, algorithm discovery, meta-learning more broadly, and beyond. It often requires differentiating through the…

Machine Learning · Computer Science 2025-06-11 Iurii Kemaev, Dan A Calian, Luisa M Zintgraf, Gregory Farquhar, Hado van Hasselt

Meta-Learning with Warped Gradient Descent

Learning an efficient update rule from data that promotes rapid learning of new tasks from the same distribution remains an open problem in meta-learning. Typically, previous works have approached this issue either by attempting to train a…

Machine Learning · Computer Science 2020-02-19 Sebastian Flennerhag, Andrei A. Rusu, Razvan Pascanu, Francesco Visin, Hujun Yin, Raia Hadsell

MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including exp-concave and strongly convex functions, but…

Machine Learning · Computer Science 2021-08-31 Tim van Erven, Wouter M. Koolen, Dirk van der Hoeven

Scalable Second Order Optimization for Deep Learning

Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization methods, that involve second derivatives and/or second…

Machine Learning · Computer Science 2021-03-08 Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Yoram Singer

Decoder Choice Network for Meta-Learning

Meta-learning has been widely used for implementing few-shot learning and fast model adaptation. One kind of meta-learning method attempts to learn how to control the gradient descent process in order to make the gradient-based learning…

Machine Learning · Computer Science 2019-11-20 Jialin Liu, Fei Chao, Longzhi Yang, Chih-Min Lin, Qiang Shen

Efficient Curvature-Aware Hypergradient Approximation for Bilevel Optimization

Bilevel optimization is a powerful tool for many machine learning problems, such as hyperparameter optimization and meta-learning. Estimating hypergradients (also known as implicit gradients) is crucial for developing gradient-based methods…

Optimization and Control · Mathematics 2025-05-06 Youran Dong, Junfeng Yang, Wei Yao, Jin Zhang

Go Beyond Your Means: Unlearning with Per-Sample Gradient Orthogonalization

Machine unlearning aims to remove the influence of problematic training data after a model has been trained. The primary challenge in machine unlearning is ensuring that the process effectively removes specified data without compromising…

Machine Learning · Computer Science 2026-03-10 Aviv Shamsian, Eitan Shaar, Aviv Navon, Gal Chechik, Ethan Fetaya

Optimizing ML Training with Metagradient Descent

A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based…

Machine Learning · Statistics 2025-03-19 Logan Engstrom, Andrew Ilyas, Benjamin Chen, Axel Feldmann, William Moses, Aleksander Madry

DenoGrad: A Gradient-Based Framework for Data Refinement in Tabular and Time-Series Learning

In the Data-Centric Artificial Intelligence (AI) paradigm, improving data quality is essential for robust machine learning. However, many denoising methods rely on rigid statistical assumptions or require clean reference data, which limits…

Artificial Intelligence · Computer Science 2026-04-28 J. Javier Alonso-Ramos, Ignacio Aguilera-Martos, Francisco Herrera, Andrés Herrera-Poyatos

Hyperbolic Graph Neural Networks at Scale: A Meta Learning Approach

The progress in hyperbolic neural networks (HNNs) research is hindered by their absence of inductive bias mechanisms, which are essential for generalizing to new tasks and facilitating scalable learning over large datasets. In this paper,…

Machine Learning · Computer Science 2023-10-31 Nurendra Choudhary, Nikhil Rao, Chandan K. Reddy

Meta-Learning with Latent Embedding Optimization

Gradient-based meta-learning techniques are both widely applicable and proficient at solving challenging few-shot learning and fast adaptation problems. However, they have practical difficulties when operating on high-dimensional parameter…

Machine Learning · Computer Science 2019-03-27 Andrei A. Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, Raia Hadsell

Memory-Reduced Meta-Learning with Guaranteed Convergence

The optimization-based meta-learning approach is gaining increased traction because of its unique ability to quickly adapt to a new task using only small amounts of data. However, existing optimization-based meta-learning approaches, such…

Machine Learning · Computer Science 2024-12-17 Honglin Yang, Ji Ma, Xiao Yu

MetaGrad: Adaptive Gradient Quantization with Hypernetworks

A popular approach to network compression is Quantization-Aware Training (QAT), which accelerates the forward pass during neural network training and inference. However, few prior efforts have been made to quantize and…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Kaixin Xu, Alina Hui Xiu Lee, Ziyuan Zhao, Zhe Wang, Min Wu, Weisi Lin

Towards Differentiable Multilevel Optimization: A Gradient-Based Approach

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu, Xuzheng Chen

Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm

Learning to learn is a powerful paradigm for enabling models to learn from data more effectively and efficiently. A popular approach to meta-learning is to train a recurrent model to read in a training dataset as input and output the…

Machine Learning · Computer Science 2018-02-16 Chelsea Finn, Sergey Levine

GradMetaNet: An Equivariant Architecture for Learning on Gradients

Gradients of neural networks encode valuable information for optimization, editing, and analysis of models. Therefore, practitioners often treat gradients as inputs to task-specific algorithms, e.g. for pruning or optimization. Recent works…

Machine Learning · Computer Science 2025-10-14 Yoav Gelberg, Yam Eitan, Aviv Navon, Aviv Shamsian, Theo Putterman, Michael Bronstein, Haggai Maron

Efficient Bilevel Optimization for Meta Label Correction in Noisy Label Learning

Training a deep neural network with noisy labels could reduce data annotation cost but may introduce noise into the learned model. In meta label correction approaches, an additional meta model besides the main model is trained with a small,…

Machine Learning · Computer Science 2026-05-19 Ba Hoang Anh Nguyen, Viet Cuong Ta

Meta Networks

Neural networks have been successfully applied in applications with a large amount of labeled data. However, the task of rapid generalization on new concepts with small training data while preserving performances on previously learned ones…

Machine Learning · Computer Science 2017-06-09 Tsendsuren Munkhdalai, Hong Yu

MetaDiff: Meta-Learning with Conditional Diffusion for Few-Shot Learning

Equipping a deep model with the ability of few-shot learning, i.e., learning quickly from only a few examples, is a core challenge for artificial intelligence. Gradient-based meta-learning approaches effectively address the challenge by learning…

Machine Learning · Computer Science 2024-01-09 Baoquan Zhang, Chuyao Luo, Demin Yu, Huiwei Lin, Xutao Li, Yunming Ye, Bowen Zhang