Related papers: Static Activation Function Normalization

Neural networks with trainable matrix activation functions

The training process of neural networks usually optimizes weights and bias parameters of linear transformations, while nonlinear activation functions are pre-specified and fixed. This work develops a systematic approach to constructing…

Machine Learning · Computer Science 2024-10-29 Zhengqi Liu, Shuhao Cao, Yuwen Li, Ludmil Zikatanov

Regularized Flexible Activation Function Combinations for Deep Neural Networks

Activation in deep neural networks is fundamental to achieving non-linear mappings. Traditional studies mainly focus on finding fixed activations for a particular set of learning tasks or model architectures. The research on flexible…

Neural and Evolutionary Computing · Computer Science 2020-08-20 Renlong Jie, Junbin Gao, Andrey Vasnev, Min-ngoc Tran

Nonlinearity Enhanced Adaptive Activation Functions

A general procedure for introducing parametric, learned, nonlinearity into activation functions is found to enhance the accuracy of representative neural networks without requiring significant additional computational resources. Examples…

Machine Learning · Computer Science 2025-05-14 David Yevick

Optimization Theory for ReLU Neural Networks Trained with Normalization Layers

The success of deep neural networks is in part due to the use of normalization layers. Normalization layers like Batch Normalization, Layer Normalization and Weight Normalization are ubiquitous in practice, as they improve generalization…

Machine Learning · Computer Science 2020-06-15 Yonatan Dukler, Quanquan Gu, Guido Montúfar

Normalization-Equivariant Neural Networks with Application to Image Denoising

In many information processing systems, it may be desirable to ensure that any change of the input, whether by shifting or scaling, results in a corresponding change in the system response. While deep neural networks are gradually replacing…

Computer Vision and Pattern Recognition · Computer Science 2024-02-22 Sébastien Herbreteau, Emmanuel Moebel, Charles Kervrann

SMU: smooth activation function for deep networks using smoothing maximum technique

Deep learning researchers have a keen interest in proposing two new novel activation functions which can boost network performance. A good choice of activation function can have significant consequences in improving network performance. A…

Machine Learning · Computer Science 2022-04-12 Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey

Cooperative Initialization based Deep Neural Network Training

Researchers have proposed various activation functions. These activation functions help the deep network to learn non-linear behavior with a significant effect on training dynamics and task performance. The performance of these activations…

Computer Vision and Pattern Recognition · Computer Science 2020-01-07 Pravendra Singh, Munender Varshney, Vinay P. Namboodiri

Activations Through Extensions: A Framework To Boost Performance Of Neural Networks

Activation functions are non-linearities in neural networks that allow them to learn complex mapping between inputs and outputs. Typical choices for activation functions are ReLU, Tanh, Sigmoid etc., where the choice generally depends on…

Machine Learning · Computer Science 2024-08-19 Chandramouli Kamanchi, Sumanta Mukherjee, Kameshwaran Sampath, Pankaj Dayama, Arindam Jati, Vijay Ekambaram, Dzung Phan

Developing Training Procedures for Piecewise-linear Spline Activation Functions in Neural Networks

Activation functions in neural networks are typically selected from a set of empirically validated, commonly used static functions such as ReLU, tanh, or sigmoid. However, by optimizing the shapes of a network's activation functions, we can…

Machine Learning · Computer Science 2025-09-24 William H Patty

Activation Functions: Do They Represent A Trade-Off Between Modular Nature of Neural Networks And Task Performance

Current research suggests that the key factors in designing neural network architectures involve choosing number of filters for every convolution layer, number of hidden neurons for every fully connected layer, dropout and pruning. The…

Machine Learning · Computer Science 2020-09-17 Himanshu Pradeep Aswani, Amit Sethi

Learning Activation Functions: A new paradigm for understanding Neural Networks

The scope of research in the domain of activation functions remains limited and centered around improving the ease of optimization or generalization quality of neural networks (NNs). However, to develop a deeper understanding of deep…

Machine Learning · Computer Science 2020-12-10 Mohit Goyal, Rajan Goyal, Brejesh Lall

Training Deep Neural Networks Without Batch Normalization

Training neural networks is an optimization problem, and finding a decent set of parameters through gradient descent can be a difficult task. A host of techniques has been developed to aid this process before and during the training phase.…

Machine Learning · Computer Science 2020-08-19 Divya Gaur, Joachim Folz, Andreas Dengel

Activation Functions in Artificial Neural Networks: A Systematic Overview

Activation functions shape the outputs of artificial neurons and, therefore, are integral parts of neural networks in general and deep learning in particular. Some activation functions, such as logistic and relu, have been used for many…

Machine Learning · Computer Science 2021-01-26 Johannes Lederer

Layer Normalization

Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the…

Machine Learning · Statistics 2016-07-22 Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

Rational neural networks

We consider neural networks with rational activation functions. The choice of the nonlinear activation function in deep learning architectures is crucial and heavily impacts the performance of a neural network. We establish optimal bounds…

Neural and Evolutionary Computing · Computer Science 2020-10-01 Nicolas Boullé, Yuji Nakatsukasa, Alex Townsend

Soft-Root-Sign Activation Function

The choice of activation function in deep networks has a significant effect on the training dynamics and task performance. At present, the most effective and widely-used activation function is ReLU. However, because of the non-zero mean,…

Computer Vision and Pattern Recognition · Computer Science 2020-03-03 Yuan Zhou, Dandan Li, Shuwei Huo, Sun-Yuan Kung

Adaptively Customizing Activation Functions for Various Layers

To enhance the nonlinearity of neural networks and increase their mapping abilities between the inputs and response variables, activation functions play a crucial role to model more complex relationships and patterns in the data. In this…

Computer Vision and Pattern Recognition · Computer Science 2021-12-20 Haigen Hu, Aizhu Liu, Qiu Guan, Xiaoxin Li, Shengyong Chen, Qianwei Zhou

Natural-Logarithm-Rectified Activation Function in Convolutional Neural Networks

Activation functions play a key role in providing remarkable performance in deep neural networks, and the rectified linear unit (ReLU) is one of the most widely used activation functions. Various new activation functions and improvements on…

Machine Learning · Computer Science 2019-08-27 Yang Liu, Jianpeng Zhang, Chao Gao, Jinghua Qu, Lixin Ji

Smooth activations and reproducibility in deep networks

Deep networks are gradually penetrating almost every domain in our lives due to their amazing success. However, with substantive performance accuracy improvements comes the price of irreproducibility. Two identical models, trained on…

Machine Learning · Computer Science 2020-12-02 Gil I. Shamir, Dong Lin, Lorenzo Coviello

Activation Functions: Dive into an optimal activation function

Activation functions have come up as one of the essential components of neural networks. The choice of adequate activation function can impact the accuracy of these methods. In this study, we experiment for finding an optimal activation…

Machine Learning · Computer Science 2022-02-25 Vipul Bansal