Related papers: Decoupling Gating from Linearity

200 papers

The Gaussian Error Linear Unit (GELU) is a widely used smooth alternative to the Rectified Linear Unit (ReLU), yet many deployment, compression, and analysis toolchains are most naturally expressed for piecewise-linear (ReLU-type) networks. We…

This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs). What distinguishes GLNs from contemporary neural networks is the distributed and local nature of their credit assignment mechanism;…

Neural networks with REctified Linear Unit (ReLU) activation functions (a.k.a. ReLU networks) have achieved great empirical success in various domains. Nonetheless, existing results for learning ReLU networks either pose assumptions on the…

Machine Learning · Statistics 2019-05-01 Gang Wang, Georgios B. Giannakis, Jie Chen

Understanding the role of (stochastic) gradient descent (SGD) in the training and generalisation of deep neural networks (DNNs) with ReLU activation has been an object of study in the recent past. In this paper, we make use of deep gated…

Machine Learning · Computer Science 2020-03-03 Chandrashekar Lakshminarayanan, Amit Vikram Singh

Despite their success, deep neural networks (DNNs) are still largely considered black boxes. The main issue is that the linear and non-linear operations are entangled in every layer, making it hard to interpret the hidden layer outputs.…

Machine Learning · Computer Science 2021-10-08 Chandrashekar Lakshminarayanan, Amit Vikram Singh

Rectified Linear Units (ReLU) have become the main model for the neural units in current deep learning systems. This choice was originally suggested as a way to compensate for the so-called vanishing gradient problem which can undercut…

Disordered Systems and Neural Networks · Physics 2024-05-06 Carlo Baldassi, Enrico M. Malatesta, Riccardo Zecchina

Despite their prevalence in neural networks, we still lack a thorough theoretical characterization of ReLU layers. This paper aims to further our understanding of ReLU layers by studying how the activation function ReLU interacts with the…

Machine Learning · Computer Science 2019-08-13 Sören Dittmer, Emily J. King, Peter Maass

We develop a novel theoretical framework for analyzing ReLU neural networks through the lens of a combinatorial object we term the ReLU Transition Graph (RTG). In this graph, each node corresponds to a linear region induced by the network's…

Machine Learning · Computer Science 2025-05-30 Sahil Rajesh Dhayalkar

Rectified Linear Units (ReLUs) have been shown to ameliorate the vanishing gradient problem, allow for efficient backpropagation, and empirically promote sparsity in the learned parameters. They have led to state-of-the-art results in a…

Machine Learning · Computer Science 2016-05-30 Xingyuan Pan, Vivek Srikumar

Deep neural networks, particularly those employing Rectified Linear Units (ReLU), are often perceived as complex, high-dimensional, non-linear systems. This complexity poses a significant challenge to understanding their internal learning…

Machine Learning · Computer Science 2025-11-11 Longqing Ye

Recently proposed Gated Linear Networks present a tractable nonlinear network architecture, and exhibit interesting capabilities such as learning with local error signals and reduced forgetting in sequential learning. In this work, we…

Machine Learning · Computer Science 2022-12-13 Qianyi Li, Haim Sompolinsky

The paper briefly reviews several recent results on hierarchical architectures for learning from examples that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in function approximation…

Machine Learning · Computer Science 2016-08-12 Hrushikesh Mhaskar, Tomaso Poggio

In this paper we investigate the family of functions representable by deep neural networks (DNN) with rectified linear units (ReLU). We give an algorithm to train a ReLU DNN with one hidden layer to *global optimality* with runtime…

Machine Learning · Computer Science 2018-03-01 Raman Arora, Amitabh Basu, Poorya Mianjy, Anirbit Mukherjee

Recently, recurrent neural networks (RNNs) have been very successful in handling sequence data. However, understanding RNNs and finding the best practices for them is a difficult task, partly because there are many competing and complex hidden…

Neural and Evolutionary Computing · Computer Science 2016-04-01 Guo-Bing Zhou, Jianxin Wu, Chen-Lin Zhang, Zhi-Hua Zhou

Rectified linear unit (ReLU) activations can also be thought of as 'gates', which either pass or stop their pre-activation input when they are 'on' (when the pre-activation input is positive) or 'off' (when the pre-activation input is…

Machine Learning · Computer Science 2021-06-15 Chandrashekar Lakshminarayanan, Amit Vikram Singh

A wide variety of activation functions have been proposed for neural networks. The Rectified Linear Unit (ReLU) is especially popular today. There are many practical reasons that motivate the use of the ReLU. This paper provides new…

Machine Learning · Statistics 2020-10-19 Rahul Parhi, Robert D. Nowak

Activation functions are fundamental to deep neural networks, governing gradient flow, optimization stability, and representational capacity. Within historic deep architectures, while ReLU has been the dominant choice for the activation…

Machine Learning · Computer Science 2026-03-10 Mingi Kang, Zai Yang, Jeova Farias Sales Rocha Neto

We prove a large deviation principle for deep neural networks with Gaussian weights and at most linearly growing activation functions, such as ReLU. This generalises earlier work, in which bounded and continuous activation functions were…

Machine Learning · Statistics 2026-02-10 Quirin Vogel

Activation functions are essential to introduce nonlinearity into neural networks, with the Rectified Linear Unit (ReLU) often favored for its simplicity and effectiveness. Motivated by the structural similarity between a shallow…

Machine Learning · Computer Science 2024-01-30 Jiayun Li, Yuxiao Cheng, Yiwen Lu, Zhuofan Xia, Yilin Mo, Gao Huang

Recurrent neural networks (RNNs) have been widely studied in sequence learning tasks, while the mainstream models (e.g., LSTM and GRU) rely on the gating mechanism (in control of how information flows between hidden states). However, the…

Computer Vision and Pattern Recognition · Computer Science 2020-05-27 Zhanzhan Cheng, Yunlu Xu, Mingjian Cheng, Yu Qiao, Shiliang Pu, Yi Niu, Fei Wu