Related papers: Reverse Engineering Deep ReLU Networks An Optimiza…

200 papers

Understanding the fundamental mechanism behind the success of deep neural networks is one of the key challenges in the modern machine learning literature. Despite numerous attempts, a solid theoretical analysis is yet to be developed. In…

Machine Learning · Computer Science 2022-01-14 Tolga Ergen, Mert Pilanci

Neural networks have shown tremendous potential for reconstructing high-resolution images in inverse problems. The non-convex and opaque nature of neural networks, however, hinders their utility in sensitive applications such as medical…

Machine Learning · Computer Science 2020-12-10 Arda Sahiner, Morteza Mardani, Batu Ozturkler, Mert Pilanci, John Pauly

The theory of deep learning focuses almost exclusively on supervised learning, non-convex optimization using stochastic gradient descent, and overparametrized neural networks. It is common belief that the optimizer dynamics, network…

Machine Learning · Computer Science 2022-02-18 Xinyi Chen, Edgar Minasyan, Jason D. Lee, Elad Hazan

We introduce and analyze a new technique for model reduction for deep neural networks. While large networks are theoretically capable of learning arbitrarily complex models, overfitting and model redundancy negatively affect the prediction…

Machine Learning · Computer Science 2017-11-27 Alireza Aghasi, Afshin Abdi, Nam Nguyen, Justin Romberg

The ongoing decarbonisation of power systems is driving an increasing reliance on distributed energy resources, which introduces complex and nonlinear interactions that are difficult to capture in conventional optimisation models. As a…

Systems and Control · Electrical Eng. & Systems 2026-01-22 Yogesh Pipada Sunil Kumar, S. Ali Pourmousavi, Jon A. R. Liisberg, Julian Lesmos-Vinasco

It has been widely assumed that a neural network cannot be recovered from its outputs, as the network depends on its parameters in a highly nonlinear way. Here, we prove that in fact it is often possible to identify the architecture,…

Machine Learning · Computer Science 2020-02-25 David Rolnick, Konrad P. Kording

Solving non-convex, NP-hard optimization problems is crucial for training machine learning models, including neural networks. However, non-convexity often leads to black-box machine learning models with unclear inner workings. While convex…

Machine Learning · Computer Science 2025-03-18 Karthik Prakhya, Tolga Birdal, Alp Yurtsever

Understanding the fundamental principles behind the success of deep neural networks is one of the most important open questions in the current literature. To this end, we study the training problem of deep neural networks and introduce an…

Machine Learning · Computer Science 2023-09-27 Tolga Ergen, Mert Pilanci

When solving decision-making problems with mathematical optimization, some constraints or objectives may lack analytic expressions but can be approximated from data. When an approximation is made by neural networks, the underlying problem…

Optimization and Control · Mathematics 2025-03-25 Xinwei Liu, Vladimir Dvorkin

Deep neural networks (DNNs), particularly those using Rectified Linear Unit (ReLU) activation functions, have achieved remarkable success across diverse machine learning tasks, including image recognition, audio processing, and language…

Machine Learning · Computer Science 2026-03-26 Emi Zeger, Mert Pilanci

Recent work has shown that the training of a one-hidden-layer, scalar-output fully-connected ReLU neural network can be reformulated as a finite-dimensional convex program. Unfortunately, the scale of such a convex program grows…

Machine Learning · Computer Science 2021-05-27 Yatong Bai, Tanmay Gautam, Yu Gai, Somayeh Sojoudi

We develop fast algorithms and robust software for convex optimization of two-layer neural networks with ReLU activation functions. Our work leverages a convex reformulation of the standard weight-decay penalized training problem as a set…

Machine Learning · Computer Science 2025-04-10 Aaron Mishkin, Arda Sahiner, Mert Pilanci

We develop a convex analytic approach to analyze finite width two-layer ReLU networks. We first prove that an optimal solution to the regularized training problem can be characterized as extreme points of a convex set, where simple…

Machine Learning · Computer Science 2021-09-01 Tolga Ergen, Mert Pilanci

Deep learning-based models have demonstrated remarkable success in solving ill-posed inverse problems; however, many fail to strictly adhere to the physical constraints imposed by the measurement process. In this work, we introduce a…

Machine Learning · Computer Science 2025-05-22 Jorge Bacca

We prove that finding all globally optimal two-layer ReLU neural networks can be performed by solving a convex optimization program with cone constraints. Our analysis is novel, characterizes all optimal solutions, and does not leverage…

Machine Learning · Computer Science 2022-03-15 Yifei Wang, Jonathan Lacotte, Mert Pilanci

Deep neural networks, particularly those employing Rectified Linear Units (ReLU), are often perceived as complex, high-dimensional, non-linear systems. This complexity poses a significant challenge to understanding their internal learning…

Machine Learning · Computer Science 2025-11-11 Longqing Ye

Deep learning, in the form of artificial neural networks, has achieved remarkable practical success in recent years, for a variety of difficult machine learning applications. However, a theoretical explanation for this remains a major open…

Machine Learning · Computer Science 2016-06-15 Itay Safran, Ohad Shamir

The success of deep neural networks hinges on our ability to accurately and efficiently optimize high-dimensional, non-convex functions. In this paper, we empirically investigate the loss functions of state-of-the-art networks, and how…

Machine Learning · Computer Science 2017-12-11 Daniel Jiwoong Im, Michael Tao, Kristin Branson

This theoretical paper is devoted to developing a rigorous theory for demystifying the global convergence phenomenon in a challenging scenario: learning over-parameterized Rectified Linear Unit (ReLU) nets for very high dimensional dataset…

Machine Learning · Computer Science 2022-06-08 Peng He

The non-convexity of the artificial neural network (ANN) training landscape brings inherent optimization difficulties. While the traditional back-propagation stochastic gradient descent (SGD) algorithm and its variants are effective in…

Machine Learning · Computer Science 2025-06-18 Yatong Bai, Tanmay Gautam, Somayeh Sojoudi