Related papers: Debugging using Orthogonal Gradient Descent

Orthogonal Gradient Descent for Continual Learning

Neural networks are achieving state of the art and sometimes super-human performance on learning tasks across a variety of domains. Whenever these problems require learning in a continual or sequential manner, however, neural networks…

Machine Learning · Computer Science 2019-10-17 Mehrdad Farajtabar, Navid Azizan, Alex Mott, Ang Li

Online Learning and Unlearning

We formalize the problem of online learning-unlearning, where a model is updated sequentially in an online setting while accommodating unlearning requests between updates. After a data point is unlearned, all subsequent outputs must be…

Machine Learning · Computer Science 2025-05-14 Yaxi Hu, Bernhard Schölkopf, Amartya Sanyal

ONG: Orthogonal Natural Gradient Descent

Orthogonal Gradient Descent (OGD) has emerged as a powerful method for continual learning. However, its Euclidean projections do not leverage the underlying information-geometric structure of the problem, which can lead to suboptimal…

Machine Learning · Computer Science 2025-12-09 Yajat Yadav, Patrick Mendoza, Jathin Korrapati

GDOD: Effective Gradient Descent using Orthogonal Decomposition for Multi-Task Learning

Multi-task learning (MTL) aims at solving multiple related tasks simultaneously and has experienced rapid growth in recent years. However, MTL models often suffer from performance degeneration with negative transfer due to learning several…

Machine Learning · Computer Science 2023-02-01 Xin Dong, Ruize Wu, Chao Xiong, Hai Li, Lei Cheng, Yong He, Shiyou Qian, Jian Cao, Linjian Mo

Online Learning with Inexact Proximal Online Gradient Descent Algorithms

We consider non-differentiable dynamic optimization problems such as those arising in robotics and subspace tracking. Given the computational constraints and the time-varying nature of the problem, a low-complexity algorithm is desirable,…

Optimization and Control · Mathematics 2019-02-20 Rishabh Dixit, Amrit Singh Bedi, Ruchi Tripathi, Ketan Rajawat

Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning Interference with Gradient Projection

Recent data-privacy laws have sparked interest in machine unlearning, which involves removing the effect of specific training samples from a learnt model as if they were never present in the original training dataset. The challenge of…

Machine Learning · Computer Science 2023-12-08 Tuan Hoang, Santu Rana, Sunil Gupta, Svetha Venkatesh

Go Beyond Your Means: Unlearning with Per-Sample Gradient Orthogonalization

Machine unlearning aims to remove the influence of problematic training data after a model has been trained. The primary challenge in machine unlearning is ensuring that the process effectively removes specified data without compromising…

Machine Learning · Computer Science 2026-03-10 Aviv Shamsian, Eitan Shaar, Aviv Navon, Gal Chechik, Ethan Fetaya

Cogradient Descent for Dependable Learning

Conventional gradient descent methods compute the gradients for multiple variables through the partial derivative. Treating the coupled variables independently while ignoring the interaction, however, leads to an insufficient optimization…

Machine Learning · Computer Science 2021-06-22 Runqi Wang, Baochang Zhang, Li'an Zhuo, Qixiang Ye, David Doermann

Optimizing ML Training with Metagradient Descent

A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based…

Machine Learning · Statistics 2025-03-19 Logan Engstrom, Andrew Ilyas, Benjamin Chen, Axel Feldmann, William Moses, Aleksander Madry

General Greedy De-bias Learning

Neural networks often make predictions relying on the spurious correlations from the datasets rather than the intrinsic properties of the task of interest, facing sharp degradation on out-of-distribution (OOD) test data. Existing de-bias…

Machine Learning · Computer Science 2023-01-20 Xinzhe Han, Shuhui Wang, Chi Su, Qingming Huang, Qi Tian

An Effective Data-Driven Approach for Localizing Deep Learning Faults

Deep Learning (DL) applications are being used to solve problems in critical domains (e.g., autonomous driving or medical diagnosis systems). Thus, developers need to debug their systems to ensure that the expected behavior is delivered.…

Software Engineering · Computer Science 2023-07-19 Mohammad Wardat, Breno Dantas Cruz, Wei Le, Hridesh Rajan

On the Theory of Continual Learning with Gradient Descent for Neural Networks

Continual learning, the ability of a model to adapt to an ongoing sequence of tasks without forgetting earlier ones, is a central goal of artificial intelligence. To better understand its underlying mechanisms, we study the limitations of…

Machine Learning · Statistics 2026-04-21 Hossein Taheri, Avishek Ghosh, Arya Mazumdar

Targeted Gradient Descent: A Novel Method for Convolutional Neural Networks Fine-tuning and Online-learning

A convolutional neural network (ConvNet) is usually trained and then tested using images drawn from the same distribution. To generalize a ConvNet to various tasks often requires a complete training dataset that consists of images drawn…

Computer Vision and Pattern Recognition · Computer Science 2021-10-01 Junyu Chen, Evren Asma, Chung Chan

Descend or Rewind? Stochastic Gradient Descent Unlearning

Machine unlearning algorithms aim to remove the impact of selected training data from a model without the computational expenses of retraining from scratch. Two such algorithms are "Descent-to-Delete" (D2D) and "Rewind-to-Delete" (R2D),…

Machine Learning · Computer Science 2026-03-02 Siqiao Mu, Diego Klabjan

Cogradient Descent for Bilinear Optimization

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure. One reason lies in the insufficient training due to the asynchronous gradient…

Computer Vision and Pattern Recognition · Computer Science 2020-06-17 Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji

Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent

In Continual Learning settings, deep neural networks are prone to Catastrophic Forgetting. Orthogonal Gradient Descent was proposed to tackle the challenge. However, no theoretical guarantees have been proven yet. We present a theoretical…

Machine Learning · Statistics 2020-12-07 Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama

Data Debugging is NP-hard for Classifiers Trained with SGD

Data debugging is the task of finding a subset of the training data such that the model obtained by retraining on the subset has better accuracy. Many heuristic approaches have been proposed; however, none of them is guaranteed to solve this problem…

Computational Complexity · Computer Science 2024-08-05 Zizheng Guo, Pengyu Chen, Yanzhang Fu, Dongjing Miao

Gradient Correction beyond Gradient Descent

The great success neural networks have achieved is inseparable from the application of gradient-descent (GD) algorithms. Based on GD, many variant algorithms have emerged to improve the GD optimization process. The gradient for…

Machine Learning · Computer Science 2023-05-29 Zefan Li, Bingbing Ni, Teng Li, WenJun Zhang, Wen Gao

Descent-to-Delete: Gradient-Based Methods for Machine Unlearning

We study the data deletion problem for convex models. By leveraging techniques from convex optimization and reservoir sampling, we give the first data deletion algorithms that are able to handle an arbitrarily long sequence of adversarial…

Machine Learning · Statistics 2020-07-07 Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi

Learning-to-Learn Stochastic Gradient Descent with Biased Regularization

We study the problem of learning-to-learn: inferring a learning algorithm that works well on tasks sampled from an unknown distribution. As the class of algorithms, we consider Stochastic Gradient Descent on the true risk regularized by the…

Machine Learning · Computer Science 2019-03-26 Giulia Denevi, Carlo Ciliberto, Riccardo Grazzi, Massimiliano Pontil