Related papers: Gradient-based Hyperparameter Optimization Over Lo…

200 papers

The Forward-Forward (FF) algorithm was recently proposed as a local learning method to address the limitations of backpropagation (BP), offering biological plausibility along with memory-efficient and highly parallelized computational…

Neural and Evolutionary Computing · Computer Science 2024-08-28 Yujie Wu, Siyuan Xu, Jibin Wu, Lei Deng, Mingkun Xu, Qinghao Wen, Guoqi Li

We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent. These procedures mirror…

Machine Learning · Statistics 2017-12-13 Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil

Fractional Gradient Descent (FGD) offers a novel and promising way to accelerate optimization by incorporating fractional calculus into machine learning. Although FGD has shown encouraging initial results across various optimization tasks,…

Machine Learning · Computer Science 2025-10-22 Jan Sobotka, Petr Šimánek, Pavel Kordík

Gradient-based bilevel optimisation is a powerful technique with applications in hyperparameter optimisation, task adaptation, algorithm discovery, meta-learning more broadly, and beyond. It often requires differentiating through the…

Machine Learning · Computer Science 2025-06-11 Iurii Kemaev, Dan A Calian, Luisa M Zintgraf, Gregory Farquhar, Hado van Hasselt

Forward gradient learning computes a noisy directional gradient and is a biologically plausible alternative to backprop for learning deep neural networks. However, the standard forward gradient algorithm, when applied naively, suffers from…

Machine Learning · Computer Science 2023-03-03 Mengye Ren, Simon Kornblith, Renjie Liao, Geoffrey Hinton

Given the limitations of backpropagation, perturbation-based gradient computation methods have recently gained focus for learning with only forward passes, also referred to as queries. Conventional forward learning consumes enormous queries…

Machine Learning · Computer Science 2025-03-11 Tao Ren, Zishi Zhang, Jinyang Jiang, Guanghao Li, Zeliang Zhang, Mingqian Feng, Yijie Peng

Parameter-Efficient Fine-Tuning (PEFT) has emerged as a key strategy for adapting large-scale pre-trained models to downstream tasks, but existing approaches face notable limitations. Addition-based methods, such as Adapters, introduce…

Computer Vision and Pattern Recognition · Computer Science 2026-05-05 Kenneth Yang, Wen-Li Wei, Jen-Chun Lin

Several machine learning applications involve the optimization of higher-order derivatives (e.g., gradients of gradients) during training, which can be expensive with respect to memory and computation even with automatic differentiation. As a…

Machine Learning · Computer Science 2020-11-26 Tianyu Pang, Kun Xu, Chongxuan Li, Yang Song, Stefano Ermon, Jun Zhu

Federated learning (FL) is an emerging technique for training machine learning models using geographically dispersed data collected by local entities. It includes local computation and synchronization steps. To reduce the communication…

Machine Learning · Computer Science 2020-03-23 Pengchao Han, Shiqiang Wang, Kin K. Leung

Classification, as a supervised learning concept, is an important topic in machine learning. It aims to categorize a set of data into classes. There are several commonly used classification methods, such as k-nearest neighbors,…

Machine Learning · Statistics 2021-10-27 Wen-Teng Chang

It has been shown that deep neural networks are prone to overfitting on biased training data. Towards addressing this issue, meta-learning employs a meta model for correcting the training bias. Despite the promising performances, super slow…

Machine Learning · Computer Science 2021-05-03 Youjiang Xu, Linchao Zhu, Lu Jiang, Yi Yang

Binary Neural Networks (BNNs) have garnered significant attention due to their immense potential for deployment on edge devices. However, the non-differentiability of the quantization function poses a challenge for the optimization of BNNs,…

Machine Learning · Computer Science 2024-12-17 Xinquan Chen, Junqi Gao, Biqing Qi, Dong Li, Yiang Luo, Fangyuan Li, Pengfei Li

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu, Xuzheng Chen

With the growing size of pre-trained models, full fine-tuning and storing all the parameters for various downstream tasks is costly and infeasible. In this paper, we propose a new parameter-efficient fine-tuning method, Gradient-based…

Computer Vision and Pattern Recognition · Computer Science 2024-12-02 Zhi Zhang, Qizhe Zhang, Zijun Gao, Renrui Zhang, Ekaterina Shutova, Shiji Zhou, Shanghang Zhang

Forward Gradients - the idea of using directional derivatives in forward differentiation mode - have recently been shown to be utilizable for neural network training while avoiding problems generally associated with backpropagation gradient…

Machine Learning · Computer Science 2023-06-13 Louis Fournier, Stéphane Rivaud, Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon

Recent trends towards training ever-larger language models have substantially improved machine learning performance across linguistic tasks. However, the huge cost of training larger models can make tuning them prohibitively expensive,…

Computation and Language · Computer Science 2022-09-13 Jared Lichtarge, Chris Alberti, Shankar Kumar

Gradient-free optimizers allow for tackling problems regardless of the smoothness or differentiability of their objective function, but they require many more iterations to converge when compared to gradient-based algorithms. This has made…

Machine Learning · Computer Science 2024-09-24 Gawel Kus, Miguel A. Bessa

Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other. However, it suffers from leakage of private information through uploaded models. In addition, as…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-12-25 Kang Wei, Jun Li, Chuan Ma, Ming Ding, Feng Shu, Haitao Zhao, Wen Chen, Hongbo Zhu

Many gradient-based meta-learning methods assume a set of parameters that do not participate in inner-optimization, which can be considered as hyperparameters. Although such hyperparameters can be optimized using the existing gradient-based…

Machine Learning · Computer Science 2022-02-15 Hae Beom Lee, Hayeon Lee, Jaewoong Shin, Eunho Yang, Timothy Hospedales, Sung Ju Hwang

Although large language models (LLMs) have achieved impressive results across numerous tasks, supervised fine-tuning (SFT) remains essential for adapting these models to specialized domains. However, SFT for domain specialization can be…

Computation and Language · Computer Science 2025-11-13 Yibai Liu, Shihang Wang, Zeming Liu, Zheming Song, Junzhe Wang, Jingjing Liu, Qingjie Liu, Yunhong Wang