Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization

Corrado Coppola; Lorenzo Papa; Irene Amerini; Laura Palagi

Machine Learning 2024-12-17 v3 Optimization and Control

Abstract

Adaptive gradient methods have been increasingly adopted by the deep learning community due to their fast convergence and reduced sensitivity to hyper-parameters. However, these methods come with limitations, such as increased memory requirements for elements like moving averages and a poorly understood convergence theory. To overcome these challenges, we introduce F-CMA, a Fast-Controlled Mini-batch Algorithm with a random reshuffling method featuring a sufficient decrease condition and a line-search procedure to ensure loss reduction per epoch, along with its deterministic proof of global convergence to a stationary point. To evaluate F-CMA, we integrate it into conventional training protocols for classification tasks involving both convolutional neural networks and vision transformer models, allowing for a direct comparison with popular optimizers. Computational tests show significant improvements, including a reduction in overall training time of up to 68%, an increase in per-epoch efficiency of up to 20%, and an increase in model accuracy of up to 5%.
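
The abstract names three ingredients: random reshuffling, a sufficient decrease condition checked per epoch, and a line-search fallback. The following is a minimal sketch of how such ingredients could fit together on a toy one-dimensional least-squares problem. It is not the authors' F-CMA. The acceptance test, the constant `gamma`, the halving factor `shrink`, the backtracking rule, and all function names are assumptions made purely for illustration.

```python
import random


def full_loss(w, data):
    """Mean squared error of the 1-D linear model y ~ w * x over all samples."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def sample_grad(w, x, y):
    """Gradient of (w * x - y) ** 2 with respect to w for a single sample."""
    return 2.0 * (w * x - y) * x


def controlled_epoch(w, data, step, rng, gamma=1e-4, shrink=0.5, max_backtracks=20):
    """Run one reshuffled epoch and return (new_w, new_step)."""
    loss_before = full_loss(w, data)

    # Random reshuffling: visit every sample exactly once, in a fresh order.
    order = list(range(len(data)))
    rng.shuffle(order)
    trial = w
    for i in order:
        x, y = data[i]
        trial -= step * sample_grad(trial, x, y)

    # Assumed sufficient-decrease test on the full loss (illustrative form).
    if full_loss(trial, data) <= loss_before - gamma * step:
        return trial, step

    # Assumed fallback: backtrack along the epoch direction d, then shrink the step.
    d = trial - w
    alpha = shrink
    for _ in range(max_backtracks):
        candidate = w + alpha * d
        if full_loss(candidate, data) <= loss_before - gamma * alpha * d * d:
            return candidate, step * shrink
        alpha *= shrink
    return w, step * shrink  # no acceptable point found: keep the old iterate


def train(data, w0=0.0, step=1.0, epochs=10, seed=0):
    """Repeat controlled epochs and record the full loss after each one."""
    rng = random.Random(seed)
    w, history = w0, [full_loss(w0, data)]
    for _ in range(epochs):
        w, step = controlled_epoch(w, data, step, rng)
        history.append(full_loss(w, data))
    return w, history


data = [(0.5, 1.5), (1.0, 3.0), (1.5, 4.5), (2.0, 6.0)]  # toy data with y = 3 * x
w_final, history = train(data)
```

In this sketch every accepted point passes a decrease test against the loss at the start of the epoch, and a rejected epoch leaves the iterate unchanged, so the recorded full loss never increases from one epoch to the next. The initial step of 1.0 is deliberately too large for this data, which forces the first epoch through the fallback branch.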

Keywords

optimization, sequence alignment, stochastic gradient descent

Cite

@article{arxiv.2411.15795,
  title  = {Beyond adaptive gradient: Fast-Controlled Minibatch Algorithm for large-scale optimization},
  author = {Corrado Coppola and Lorenzo Papa and Irene Amerini and Laura Palagi},
  journal= {arXiv preprint arXiv:2411.15795},
  year   = {2024}
}

Comments

There is an error in the literature review in Section 1: citation [65] has been erroneously associated with another author's claims.
