Adaptive Moment Estimation Optimization Algorithm Using Projection Gradient for Deep Learning

Optimization and Control 2025-03-14 v1 Machine Learning

Abstract

Training deep neural networks is challenging. To accelerate training and improve performance, we propose PadamP, a novel optimization algorithm. PadamP applies adaptive estimation of the p-th power of the second-order moments under scale invariance and improves projection adaptability by modifying the projection discrimination condition. Integrated into Adam-type algorithms, it accelerates training and improves generalization in deep learning. By combining the benefits of projected gradients with adaptive moment estimation, PadamP addresses unconstrained non-convex problems. We analyze convergence in the non-convex case, focusing on decoupling the first-order and second-order moment estimation coefficients. Unlike prior work relying on , our proof generalizes the convergence theorem, enhancing practicality. Experiments with VGG-16 and ResNet-18 on CIFAR-10 and CIFAR-100 demonstrate PadamP's effectiveness, most notably for VGG-16. The results show that PadamP outperforms existing algorithms in convergence speed and generalization ability, making it a valuable addition to deep learning optimization.
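
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a single update step, not the authors' algorithm. It assumes a partially adaptive step that divides the first-moment estimate by the p-th power of the second-moment estimate, and it uses a cosine-similarity test between weights and gradient as a stand-in projection condition, since the abstract does not state PadamP's modified condition. The function name `padam_projected_step`, the threshold `delta`, all default values, and the omission of bias correction are illustrative assumptions.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def padam_projected_step(w, g, m, v, lr=0.1, beta1=0.9, beta2=0.999,
                         p=0.25, eps=1e-8, delta=0.1):
    """One hypothetical update; returns (new_w, new_m, new_v, projected).

    Assumption: this is an illustrative sketch, not the PadamP algorithm.
    Bias correction is omitted to keep the example short.
    """
    # Exponential moving averages of the gradient and of its square.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]

    # Partially adaptive step: divide by v**p rather than sqrt(v).
    step = [mi / (vi ** p + eps) for mi, vi in zip(m, v)]

    # Stand-in projection condition (assumption): if the gradient is nearly
    # orthogonal to the weights, treat the weights as scale-invariant.
    w_norm = math.sqrt(dot(w, w))
    g_norm = math.sqrt(dot(g, g))
    projected = False
    if w_norm > 0 and g_norm > 0:
        cos = abs(dot(w, g)) / (w_norm * g_norm)
        if cos < delta / math.sqrt(len(w)):
            # Remove the component of the step that is parallel to w.
            radial = dot(w, step) / (w_norm ** 2)
            step = [si - radial * wi for si, wi in zip(step, w)]
            projected = True

    new_w = [wi - lr * si for wi, si in zip(w, step)]
    return new_w, m, v, projected
```

When the projection fires, the update only moves the weights tangentially, so accumulated momentum along the weight direction does not change the weight component in that direction. With `p = 0.5` and the condition disabled, the step reduces to an Adam-style update without bias correction.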

Cite

@article{arxiv.2503.10005,
  title  = {Adaptive Moment Estimation Optimization Algorithm Using Projection Gradient for Deep Learning},
  author = {Yongqi Li and Xiaowei Zhang},
  journal= {arXiv preprint arXiv:2503.10005},
  year   = {2025}
}