Related papers: Qualitatively characterizing neural network optimi…

Provably Training Overparameterized Neural Network Classifiers with Non-convex Constraints

Training a classifier under non-convex constraints has gotten increasing attention in the machine learning community thanks to its wide range of applications such as algorithmic fairness and class-imbalanced classification. However, several…

Machine Learning · Statistics 2022-10-31 You-Lin Chen , Zhaoran Wang , Mladen Kolar

Globally Optimal Training of Generalized Polynomial Neural Networks with Nonlinear Spectral Methods

The optimization problem behind neural networks is highly non-convex. Training with stochastic gradient descent and variants requires careful parameter tuning and provides no guarantee to achieve the global optimum. In contrast we show…

Machine Learning · Computer Science 2016-10-31 Antoine Gautier , Quynh Nguyen , Matthias Hein

Towards Understanding the Importance of Noise in Training Neural Networks

Numerous empirical evidence has corroborated that the noise plays a crucial rule in effective and efficient training of neural networks. The theory behind, however, is still largely unknown. This paper studies this fundamental problem…

Machine Learning · Computer Science 2019-09-10 Mo Zhou , Tianyi Liu , Yan Li , Dachao Lin , Enlu Zhou , Tuo Zhao

Why Do Local Methods Solve Nonconvex Problems?

Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the…

Machine Learning · Computer Science 2021-03-26 Tengyu Ma

Learning Hard Optimization Problems: A Data Generation Perspective

Optimization problems are ubiquitous in our societies and are present in almost every segment of the economy. Most of these optimization problems are NP-hard and computationally demanding, often requiring approximate solutions for…

Optimization and Control · Mathematics 2021-06-23 James Kotary , Ferdinando Fioretto , Pascal Van Hentenryck

The loss surface of deep and wide neural networks

While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in suboptimal points. It has been argued that this is…

Machine Learning · Computer Science 2017-06-14 Quynh Nguyen , Matthias Hein

Algorithms for solving optimization problems arising from deep neural net models: nonsmooth problems

Machine Learning models incorporating multiple layered learning networks have been seen to provide effective models for various classification problems. The resulting optimization problem to solve for the optimal vector minimizing the…

Optimization and Control · Mathematics 2018-07-03 Vyacheslav Kungurtsev , Tomas Pevny

Towards a Mathematical Understanding of the Difficulty in Learning with Feedforward Neural Networks

Training deep neural networks for solving machine learning problems is one great challenge in the field, mainly due to its associated optimisation problem being highly non-convex. Recent developments have suggested that many training…

Machine Learning · Computer Science 2017-11-23 Hao Shen

Training Neural Networks at Any Scale

This article reviews modern optimization methods for training neural networks with an emphasis on efficiency and scale. We present state-of-the-art optimization algorithms under a unified algorithmic template that highlights the importance…

Machine Learning · Computer Science 2025-11-17 Thomas Pethick , Kimon Antonakopoulos , Antonio Silveti-Falls , Leena Chennuru Vankadara , Volkan Cevher

Stochastic Training of Neural Networks via Successive Convex Approximations

This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of non-convex optimization, going under the general name of successive convex approximation (SCA)…

Machine Learning · Statistics 2017-06-16 Simone Scardapane , Paolo Di Lorenzo

Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

Training neural networks is a challenging non-convex optimization problem, and backpropagation or gradient descent can get stuck in spurious local optima. We propose a novel algorithm based on tensor decomposition for guaranteed training of…

Machine Learning · Computer Science 2016-01-13 Majid Janzamin , Hanie Sedghi , Anima Anandkumar

Porcupine Neural Networks: (Almost) All Local Optima are Global

Neural networks have been used prominently in several machine learning and statistics applications. In general, the underlying optimization of neural networks is non-convex which makes their performance analysis challenging. In this paper,…

Machine Learning · Statistics 2017-10-09 Soheil Feizi , Hamid Javadi , Jesse Zhang , David Tse

A Variational Inequality Model for Learning Neural Networks

Neural networks have become ubiquitous tools for solving signal and image processing problems, and they often outperform standard approaches. Nevertheless, training neural networks is a challenging task in many applications. The prevalent…

Optimization and Control · Mathematics 2022-10-28 Patrick L. Combettes , Jean-Christophe Pesquet , Audrey Repetti

How degenerate is the parametrization of neural networks with the ReLU activation function?

Neural network training is usually accomplished by solving a non-convex optimization problem using stochastic gradient descent. Although one optimizes over the networks parameters, the main loss function generally only depends on the…

Machine Learning · Computer Science 2023-02-10 Julius Berner , Dennis Elbrächter , Philipp Grohs

Non-Convex Optimizations for Machine Learning with Theoretical Guarantee: Robust Matrix Completion and Neural Network Learning

Despite the recent development in machine learning, most learning systems are still under the concept of "black box", where the performance cannot be understood and derived. With the rise of safety and privacy concerns in public, designing…

Machine Learning · Computer Science 2023-06-30 Shuai Zhang

Differentiable Convex Optimization Layers in Neural Architectures: Foundations and Perspectives

The integration of optimization problems within neural network architectures represents a fundamental shift from traditional approaches to handling constraints in deep learning. While it is long known that neural networks can incorporate…

Machine Learning · Computer Science 2024-12-31 Calder Katyal

On Optimizing Deep Convolutional Neural Networks by Evolutionary Computing

Optimization for deep networks is currently a very active area of research. As neural networks become deeper, the ability in manually optimizing the network becomes harder. Mini-batch normalization, identification of effective respective…

Neural and Evolutionary Computing · Computer Science 2018-08-07 M. U. B. Dias , D. D. N. De Silva , S. Fernando

Non-convex Optimization for Machine Learning

A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low…

Machine Learning · Statistics 2017-12-22 Prateek Jain , Purushottam Kar

Training Linear Neural Networks: Non-Local Convergence and Complexity Results

Linear networks provide valuable insights into the workings of neural networks in general. This paper identifies conditions under which the gradient flow provably trains a linear network, in spite of the non-strict saddle points present in…

Optimization and Control · Mathematics 2020-06-30 Armin Eftekhari

Optimization for deep learning: theory and algorithms

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more…

Machine Learning · Computer Science 2019-12-21 Ruoyu Sun