Related papers: Deep ReLU Programming

The layer-wise L1 Loss Landscape of Neural Nets is more complex around local minima

For fixed training data and network parameters in the other layers, the L1 loss of a ReLU neural network as a function of the first layer's parameters is a piece-wise affine function. We use the Deep ReLU Simplex algorithm to iteratively…

Machine Learning · Statistics 2021-05-07 Peter Hinz

Compelling ReLU Networks to Exhibit Exponentially Many Linear Regions at Initialization and During Training

In a neural network with ReLU activations, the number of piecewise linear regions in the output can grow exponentially with depth. However, this is highly unlikely to happen when the initial parameters are sampled randomly, which therefore…

Machine Learning · Computer Science 2025-10-17 Max Milkert, David Hyde, Forrest Laine

The Evolution of the Interplay Between Input Distributions and Linear Regions in Networks

It is commonly recognized that the expressiveness of deep neural networks is contingent upon a range of factors, encompassing their depth, width, and other relevant considerations. Currently, the practical performance of the majority of…

Machine Learning · Computer Science 2023-11-08 Xuan Qi, Yi Wei

Locally Linear Attributes of ReLU Neural Networks

A ReLU neural network determines/is a continuous piecewise linear map from an input space to an output space. The weights in the neural network determine a decomposition of the input space into convex polytopes and on each of these…

Machine Learning · Computer Science 2020-12-04 Ben Sattelberg, Renzo Cavalieri, Michael Kirby, Chris Peterson, Ross Beveridge

ReLUs Are Sufficient for Learning Implicit Neural Representations

Motivated by the growing theoretical understanding of neural networks that employ the Rectified Linear Unit (ReLU) as their activation function, we revisit the use of ReLU activation functions for learning implicit neural representations…

Image and Video Processing · Electrical Eng. & Systems 2024-08-05 Joseph Shenouda, Yamin Zhou, Robert D. Nowak

Deep Representation with ReLU Neural Networks

We consider deep feedforward neural networks with rectified linear units from a signal processing perspective. In this view, such representations mark the transition from using a single (data-driven) linear representation to utilizing a…

Machine Learning · Computer Science 2019-04-01 Andreas Heinecke, Wen-Liang Hwang

Over-parametrized neural networks as under-determined linear systems

We draw connections between simple neural networks and under-determined linear systems to comprehensively explore several interesting theoretical questions in the study of neural networks. First, we emphatically show that it is unsurprising…

Numerical Analysis · Mathematics 2020-11-02 Austin R. Benson, Anil Damle, Alex Townsend

Limitations of neural network training due to numerical instability of backpropagation

We study the training of deep neural networks by gradient descent where floating-point arithmetic is used to compute the gradients. In this framework and under realistic assumptions, we demonstrate that it is highly unlikely to find ReLU…

Machine Learning · Computer Science 2023-11-16 Clemens Karner, Vladimir Kazeev, Philipp Christian Petersen

Diverse Neural Network Learns True Target Functions

Neural networks are a powerful class of functions that can be trained with simple gradient descent to achieve state-of-the-art performance on a variety of applications. Despite their practical success, there is a paucity of results that…

Machine Learning · Computer Science 2017-03-06 Bo Xie, Yingyu Liang, Le Song

Neural networks with trainable matrix activation functions

The training process of neural networks usually optimizes weights and bias parameters of linear transformations, while nonlinear activation functions are pre-specified and fixed. This work develops a systematic approach to constructing…

Machine Learning · Computer Science 2024-10-29 Zhengqi Liu, Shuhao Cao, Yuwen Li, Ludmil Zikatanov

Practical Convex Formulation of Robust One-hidden-layer Neural Network Training

Recent work has shown that the training of a one-hidden-layer, scalar-output fully-connected ReLU neural network can be reformulated as a finite-dimensional convex program. Unfortunately, the scale of such a convex program grows…

Machine Learning · Computer Science 2021-05-27 Yatong Bai, Tanmay Gautam, Yu Gai, Somayeh Sojoudi

Using activation histograms to bound the number of affine regions in ReLU feed-forward neural networks

Several current bounds on the maximal number of affine regions of a ReLU feed-forward neural network are special cases of the framework [1] which relies on layer-wise activation histogram bounds. We analyze and partially solve a problem in…

Machine Learning · Statistics 2021-04-09 Peter Hinz

Dissecting Deep Neural Networks

In exchange for large quantities of data and processing power, deep neural networks have yielded models that provide state-of-the-art prediction capabilities in many fields. However, a lack of strong guarantees on their behaviour has…

Machine Learning · Computer Science 2020-01-22 Haakon Robinson, Adil Rasheed, Omer San

A Framework for the construction of upper bounds on the number of affine linear regions of ReLU feed-forward neural networks

We present a framework to derive upper bounds on the number of regions that feed-forward neural networks with ReLU activation functions are affine linear on. It is based on an inductive analysis that keeps track of the number of such…

Machine Learning · Statistics 2020-03-10 Peter Hinz, Sara van de Geer

Unveiling the Training Dynamics of ReLU Networks through a Linear Lens

Deep neural networks, particularly those employing Rectified Linear Units (ReLU), are often perceived as complex, high-dimensional, non-linear systems. This complexity poses a significant challenge to understanding their internal learning…

Machine Learning · Computer Science 2025-11-11 Longqing Ye

Training invariances and the low-rank phenomenon: beyond linear networks

The implicit bias induced by the training of neural networks has become a topic of rigorous study. In the limit of gradient flow and gradient descent with appropriate step size, it has been shown that when one trains a deep linear network…

Machine Learning · Computer Science 2022-04-27 Thien Le, Stefanie Jegelka

ReLU Networks as Random Functions: Their Distribution in Probability Space

This paper presents a novel framework for understanding trained ReLU networks as random, affine functions, where the randomness is induced by the distribution over the inputs. By characterizing the probability distribution of the network's…

Machine Learning · Computer Science 2025-03-31 Shreyas Chaudhari, José M. F. Moura

Explicit Foundation Model Optimization with Self-Attentive Feed-Forward Neural Units

Iterative approximation methods using backpropagation enable the optimization of neural networks, but they remain computationally expensive, especially when used at scale. This paper presents an efficient alternative for optimizing neural…

Machine Learning · Computer Science 2023-11-14 Jake Ryland Williams, Haoran Zhao

A ReLU Dense Layer to Improve the Performance of Neural Networks

We propose ReDense as a simple and low complexity way to improve the performance of trained neural networks. We use a combination of random weights and rectified linear unit (ReLU) activation function to add a ReLU dense (ReDense) layer to…

Machine Learning · Computer Science 2020-10-27 Alireza M. Javid, Sandipan Das, Mikael Skoglund, Saikat Chatterjee

Reverse Engineering Deep ReLU Networks An Optimization-based Algorithm

Reverse engineering deep ReLU networks is a critical problem in understanding the complex behavior and interpretability of neural networks. In this research, we present a novel method for reconstructing deep ReLU networks by leveraging…

Machine Learning · Computer Science 2023-12-11 Mehrab Hamidi