Related papers: Decoupling Gating from Linearity

$\lambda$-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks

Gaussian Error Linear Unit (GELU) is a widely used smooth alternative to Rectified Linear Unit (ReLU), yet many deployment, compression, and analysis toolchains are most naturally expressed for piecewise-linear (ReLU-type) networks. We…

Machine Learning · Computer Science 2026-04-06 Cristian Pérez-Corral, Alberto Fernández-Hernández, Jose I. Mestre, Manuel F. Dolz, Enrique S. Quintana-Ortí

Gated Linear Networks

This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism;…

Machine Learning · Computer Science 2020-06-12 Joel Veness, Tor Lattimore, David Budden, Avishkar Bhoopchand, Christopher Mattern, Agnieszka Grabska-Barwinska, Eren Sezener, Jianan Wang, Peter Toth, Simon Schmitt, Marcus Hutter

Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization

Neural networks with REctified Linear Unit (ReLU) activation functions (a.k.a. ReLU networks) have achieved great empirical success in various domains. Nonetheless, existing results for learning ReLU networks either pose assumptions on the…

Machine Learning · Statistics 2019-05-01 Gang Wang, Georgios B. Giannakis, Jie Chen

Deep Gated Networks: A framework to understand training and generalisation in deep learning

Understanding the role of (stochastic) gradient descent (SGD) in the training and generalisation of deep neural networks (DNNs) with ReLU activation has been the object of study in the recent past. In this paper, we make use of deep gated…

Machine Learning · Computer Science 2020-03-03 Chandrashekar Lakshminarayanan, Amit Vikram Singh

Disentangling deep neural networks with rectified linear units using duality

Despite their success, deep neural networks (DNNs) are still largely considered as black boxes. The main issue is that the linear and non-linear operations are entangled in every layer, making it hard to interpret the hidden layer outputs.…

Machine Learning · Computer Science 2021-10-08 Chandrashekar Lakshminarayanan, Amit Vikram Singh

Properties of the geometry of solutions and capacity of multi-layer neural networks with Rectified Linear Units activations

Rectified Linear Units (ReLU) have become the main model for the neural units in current deep learning systems. This choice was originally suggested as a way to compensate for the so-called vanishing gradient problem, which can undercut…

Disordered Systems and Neural Networks · Physics 2024-05-06 Carlo Baldassi, Enrico M. Malatesta, Riccardo Zecchina

Singular Values for ReLU Layers

Despite their prevalence in neural networks we still lack a thorough theoretical characterization of ReLU layers. This paper aims to further our understanding of ReLU layers by studying how the activation function ReLU interacts with the…

Machine Learning · Computer Science 2019-08-13 Sören Dittmer, Emily J. King, Peter Maass

The Geometry of ReLU Networks through the ReLU Transition Graph

We develop a novel theoretical framework for analyzing ReLU neural networks through the lens of a combinatorial object we term the ReLU Transition Graph (RTG). In this graph, each node corresponds to a linear region induced by the network's…

Machine Learning · Computer Science 2025-05-30 Sahil Rajesh Dhayalkar

Expressiveness of Rectifier Networks

Rectified Linear Units (ReLUs) have been shown to ameliorate the vanishing gradient problem, allow for efficient backpropagation, and empirically promote sparsity in the learned parameters. They have led to state-of-the-art results in a…

Machine Learning · Computer Science 2016-05-30 Xingyuan Pan, Vivek Srikumar

Unveiling the Training Dynamics of ReLU Networks through a Linear Lens

Deep neural networks, particularly those employing Rectified Linear Units (ReLU), are often perceived as complex, high-dimensional, non-linear systems. This complexity poses a significant challenge to understanding their internal learning…

Machine Learning · Computer Science 2025-11-11 Longqing Ye

Globally Gated Deep Linear Networks

Recently proposed Gated Linear Networks present a tractable nonlinear network architecture, and exhibit interesting capabilities such as learning with local error signals and reduced forgetting in sequential learning. In this work, we…

Machine Learning · Computer Science 2022-12-13 Qianyi Li, Haim Sompolinsky

Deep vs. shallow networks: An approximation theory perspective

The paper briefly reviews several recent results on hierarchical architectures for learning from examples, that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation…

Machine Learning · Computer Science 2016-08-12 Hrushikesh Mhaskar, Tomaso Poggio

Understanding Deep Neural Networks with Rectified Linear Units

In this paper we investigate the family of functions representable by deep neural networks (DNN) with rectified linear units (ReLU). We give an algorithm to train a ReLU DNN with one hidden layer to *global optimality* with runtime…

Machine Learning · Computer Science 2018-03-01 Raman Arora, Amitabh Basu, Poorya Mianjy, Anirbit Mukherjee

Minimal Gated Unit for Recurrent Neural Networks

Recently, recurrent neural networks (RNNs) have been very successful in handling sequence data. However, understanding RNNs and finding the best practices for RNNs is a difficult task, partly because there are many competing and complex hidden…

Neural and Evolutionary Computing · Computer Science 2016-04-01 Guo-Bing Zhou, Jianxin Wu, Chen-Lin Zhang, Zhi-Hua Zhou

Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning

Rectified linear unit (ReLU) activations can also be thought of as 'gates', which either pass or stop their pre-activation input when they are 'on' (when the pre-activation input is positive) or 'off' (when the pre-activation input is…

Machine Learning · Computer Science 2021-06-15 Chandrashekar Lakshminarayanan, Amit Vikram Singh

The Role of Neural Network Activation Functions

A wide variety of activation functions have been proposed for neural networks. The Rectified Linear Unit (ReLU) is especially popular today. There are many practical reasons that motivate the use of the ReLU. This paper provides new…

Machine Learning · Statistics 2020-10-19 Rahul Parhi, Robert D. Nowak

IGLU: The Integrated Gaussian Linear Unit Activation Function

Activation functions are fundamental to deep neural networks, governing gradient flow, optimization stability, and representational capacity. Within historic deep architectures, while ReLU has been the dominant choice for the activation…

Machine Learning · Computer Science 2026-03-10 Mingi Kang, Zai Yang, Jeova Farias Sales Rocha Neto

Large Deviations of Gaussian Neural Networks with ReLU activation

We prove a large deviation principle for deep neural networks with Gaussian weights and at most linearly growing activation functions, such as ReLU. This generalises earlier work, in which bounded and continuous activation functions were…

Machine Learning · Statistics 2026-02-10 Quirin Vogel

Generalized Activation via Multivariate Projection

Activation functions are essential to introduce nonlinearity into neural networks, with the Rectified Linear Unit (ReLU) often favored for its simplicity and effectiveness. Motivated by the structural similarity between a shallow…

Machine Learning · Computer Science 2024-01-30 Jiayun Li, Yuxiao Cheng, Yiwen Lu, Zhuofan Xia, Yilin Mo, Gao Huang

Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units

Recurrent neural network (RNN) has been widely studied in sequence learning tasks, while the mainstream models (e.g., LSTM and GRU) rely on the gating mechanism (in control of how information flows between hidden states). However, the…

Computer Vision and Pattern Recognition · Computer Science 2020-05-27 Zhanzhan Cheng, Yunlu Xu, Mingjian Cheng, Yu Qiao, Shiliang Pu, Yi Niu, Fei Wu