Related papers: Reverse Engineering Deep ReLU Networks An Optimiza…

Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs

Understanding the fundamental mechanism behind the success of deep neural networks is one of the key challenges in the modern machine learning literature. Despite numerous attempts, a solid theoretical analysis is yet to be developed. In…

Machine Learning · Computer Science 2022-01-14 Tolga Ergen, Mert Pilanci

Convex Regularization Behind Neural Reconstruction

Neural networks have shown tremendous potential for reconstructing high-resolution images in inverse problems. The non-convex and opaque nature of neural networks, however, hinders their utility in sensitive applications such as medical…

Machine Learning · Computer Science 2020-12-10 Arda Sahiner, Morteza Mardani, Batu Ozturkler, Mert Pilanci, John Pauly

Provable Regret Bounds for Deep Online Learning and Control

The theory of deep learning focuses almost exclusively on supervised learning, non-convex optimization using stochastic gradient descent, and overparametrized neural networks. It is common belief that the optimizer dynamics, network…

Machine Learning · Computer Science 2022-02-18 Xinyi Chen, Edgar Minasyan, Jason D. Lee, Elad Hazan

Net-Trim: Convex Pruning of Deep Neural Networks with Performance Guarantee

We introduce and analyze a new technique for model reduction for deep neural networks. While large networks are theoretically capable of learning arbitrarily complex models, overfitting and model redundancy negatively affect the prediction…

Machine Learning · Computer Science 2017-11-27 Alireza Aghasi, Afshin Abdi, Nam Nguyen, Justin Romberg

Efficient reformulations of ReLU deep neural networks for surrogate modelling in power system optimisation

The ongoing decarbonisation of power systems is driving an increasing reliance on distributed energy resources, which introduces complex and nonlinear interactions that are difficult to capture in conventional optimisation models. As a…

Systems and Control · Electrical Eng. & Systems 2026-01-22 Yogesh Pipada Sunil Kumar, S. Ali Pourmousavi, Jon A. R. Liisberg, Julian Lesmos-Vinasco

Reverse-Engineering Deep ReLU Networks

It has been widely assumed that a neural network cannot be recovered from its outputs, as the network depends on its parameters in a highly nonlinear way. Here, we prove that in fact it is often possible to identify the architecture,…

Machine Learning · Computer Science 2020-02-25 David Rolnick, Konrad P. Kording

Convex Formulations for Training Two-Layer ReLU Neural Networks

Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex…

Machine Learning · Computer Science 2025-03-18 Karthik Prakhya, Tolga Birdal, Alp Yurtsever

Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks

Understanding the fundamental principles behind the success of deep neural networks is one of the most important open questions in the current literature. To this end, we study the training problem of deep neural networks and introduce an…

Machine Learning · Computer Science 2023-09-27 Tolga Ergen, Mert Pilanci

Optimization over Trained Neural Networks: Difference-of-Convex Algorithm and Application to Data Center Scheduling

When solving decision-making problems with mathematical optimization, some constraints or objectives may lack analytic expressions but can be approximated from data. When an approximation is made by neural networks, the underlying problem…

Optimization and Control · Mathematics 2025-03-25 Xinwei Liu, Vladimir Dvorkin

Unveiling Hidden Convexity in Deep Learning: a Sparse Signal Processing Perspective

Deep neural networks (DNNs), particularly those using Rectified Linear Unit (ReLU) activation functions, have achieved remarkable success across diverse machine learning tasks, including image recognition, audio processing, and language…

Machine Learning · Computer Science 2026-03-26 Emi Zeger, Mert Pilanci

Practical Convex Formulation of Robust One-hidden-layer Neural Network Training

Recent work has shown that the training of a one-hidden-layer, scalar-output fully-connected ReLU neural network can be reformulated as a finite-dimensional convex program. Unfortunately, the scale of such a convex program grows…

Machine Learning · Computer Science 2021-05-27 Yatong Bai, Tanmay Gautam, Yu Gai, Somayeh Sojoudi

Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions

We develop fast algorithms and robust software for convex optimization of two-layer neural networks with ReLU activation functions. Our work leverages a convex reformulation of the standard weight-decay penalized training problem as a set…

Machine Learning · Computer Science 2025-04-10 Aaron Mishkin, Arda Sahiner, Mert Pilanci

Convex Geometry and Duality of Over-parameterized Neural Networks

We develop a convex analytic approach to analyze finite width two-layer ReLU networks. We first prove that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set, where simple…

Machine Learning · Computer Science 2021-09-01 Tolga Ergen, Mert Pilanci

Projection-Based Correction for Enhancing Deep Inverse Networks

Deep learning-based models have demonstrated remarkable success in solving ill-posed inverse problems; however, many fail to strictly adhere to the physical constraints imposed by the measurement process. In this work, we introduce a…

Machine Learning · Computer Science 2025-05-22 Jorge Bacca

The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural Networks: an Exact Characterization of the Optimal Solutions

We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints. Our analysis is novel, characterizes all optimal solutions, and does not leverage…

Machine Learning · Computer Science 2022-03-15 Yifei Wang, Jonathan Lacotte, Mert Pilanci

Unveiling the Training Dynamics of ReLU Networks through a Linear Lens

Deep neural networks, particularly those employing Rectified Linear Units (ReLU), are often perceived as complex, high-dimensional, non-linear systems. This complexity poses a significant challenge to understanding their internal learning…

Machine Learning · Computer Science 2025-11-11 Longqing Ye

On the Quality of the Initial Basin in Overspecified Neural Networks

Deep learning, in the form of artificial neural networks, has achieved remarkable practical success in recent years, for a variety of difficult machine learning applications. However, a theoretical explanation for this remains a major open…

Machine Learning · Computer Science 2016-06-15 Itay Safran, Ohad Shamir

An empirical analysis of the optimization of deep network loss surfaces

The success of deep neural networks hinges on our ability to accurately and efficiently optimize high-dimensional, non-convex functions. In this paper, we empirically investigate the loss functions of state-of-the-art networks, and how…

Machine Learning · Computer Science 2017-12-11 Daniel Jiwoong Im, Michael Tao, Kristin Branson

Demystifying the Global Convergence Puzzle of Learning Over-parameterized ReLU Nets in Very High Dimensions

This theoretical paper is devoted to developing a rigorous theory for demystifying the global convergence phenomenon in a challenging scenario: learning over-parameterized Rectified Linear Unit (ReLU) nets for very high dimensional dataset…

Machine Learning · Computer Science 2022-06-08 Peng He

Efficient Global Optimization of Two-Layer ReLU Networks: Quadratic-Time Algorithms and Adversarial Training

The non-convexity of the artificial neural network (ANN) training landscape brings inherent optimization difficulties. While the traditional back-propagation stochastic gradient descent (SGD) algorithm and its variants are effective in…

Machine Learning · Computer Science 2025-06-18 Yatong Bai, Tanmay Gautam, Somayeh Sojoudi