Related papers: Hysteresis Activation Function for Efficient Infer…

TeLU Activation Function for Fast and Stable Deep Learning

We propose the Hyperbolic Tangent Exponential Linear Unit (TeLU), a neural network hidden activation function defined as $\mathrm{TeLU}(x) = x \cdot \tanh(e^x)$. TeLU's design is grounded in the core principles of key activation functions, achieving strong…

Machine Learning · Computer Science 2025-01-03 Alfredo Fernandez, Ankur Mali

TaLU: A Hybrid Activation Function Combining Tanh and Rectified Linear Unit to Enhance Neural Networks

The application of deep learning models in classification plays an important role in the accurate detection of target objects. However, the accuracy is affected by the activation function in the hidden and output layers. In this…

Computer Vision and Pattern Recognition · Computer Science 2023-05-22 Md. Mehedi Hasan, Md. Ali Hossain, Azmain Yakin Srizon, Abu Sayeed

ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models

Large Language Models (LLMs) with billions of parameters have drastically transformed AI applications. However, their demanding computation during inference has raised significant challenges for deployment on resource-constrained devices.…

Machine Learning · Computer Science 2023-10-10 Iman Mirzadeh, Keivan Alizadeh, Sachin Mehta, Carlo C Del Mundo, Oncel Tuzel, Golnoosh Samei, Mohammad Rastegari, Mehrdad Farajtabar

FReLU: Flexible Rectified Linear Units for Improving Convolutional Neural Networks

Rectified linear unit (ReLU) is a widely used activation function for deep convolutional neural networks. However, because of the zero-hard rectification, ReLU networks miss the benefits from negative values. In this paper, we propose a…

Computer Vision and Pattern Recognition · Computer Science 2018-01-30 Suo Qiu, Xiangmin Xu, Bolun Cai

GELU Activation Function in Deep Learning: A Comprehensive Mathematical Analysis and Performance

Selecting the most suitable activation function is a critical factor in the effectiveness of deep learning models, as it influences their learning capacity, stability, and computational efficiency. In recent years, the Gaussian Error Linear…

Machine Learning · Computer Science 2023-08-02 Minhyeok Lee

Overcoming Overfitting and Large Weight Update Problem in Linear Rectifiers: Thresholded Exponential Rectified Linear Units

In the past few years, rectified linear unit activation functions have shown their significance in neural networks, surpassing the performance of sigmoid activations. RELU (Nair & Hinton, 2010), ELU (Clevert et al., 2015), PRELU (He et al.,…

Machine Learning · Computer Science 2020-06-05 Vijay Pandey

Natural-Logarithm-Rectified Activation Function in Convolutional Neural Networks

Activation functions play a key role in providing remarkable performance in deep neural networks, and the rectified linear unit (ReLU) is one of the most widely used activation functions. Various new activation functions and improvements on…

Machine Learning · Computer Science 2019-08-27 Yang Liu, Jianpeng Zhang, Chao Gao, Jinghua Qu, Lixin Ji

A Methodology for Automatic Selection of Activation Functions to Design Hybrid Deep Neural Networks

Activation functions influence behavior and performance of DNNs. Nonlinear activation functions, like Rectified Linear Units (ReLU), Exponential Linear Units (ELU) and Scaled Exponential Linear Units (SELU), outperform the linear…

Neural and Evolutionary Computing · Computer Science 2019-02-05 Alberto Marchisio, Muhammad Abdullah Hanif, Semeen Rehman, Maurizio Martina, Muhammad Shafique

Stable and Robust Deep Learning By Hyperbolic Tangent Exponential Linear Unit (TeLU)

In this paper, we introduce the Hyperbolic Tangent Exponential Linear Unit (TeLU), a novel neural network activation function, represented as $f(x) = x \cdot \tanh(e^x)$. TeLU is designed to overcome the limitations of conventional…

Machine Learning · Computer Science 2024-02-06 Alfredo Fernandez, Ankur Mali

Gaussian Error Linear Units (GELUs)

We propose the Gaussian Error Linear Unit (GELU), a high-performing neural network activation function. The GELU activation function is $x\Phi(x)$, where $\Phi(x)$ is the standard Gaussian cumulative distribution function. The GELU…

Machine Learning · Computer Science 2023-06-07 Dan Hendrycks, Kevin Gimpel

ReLU$^2$ Wins: Discovering Efficient Activation Functions for Sparse LLMs

Sparse computation offers a compelling solution for the inference of Large Language Models (LLMs) in low-resource scenarios by dynamically skipping the computation of inactive neurons. While traditional approaches focus on ReLU-based LLMs,…

Machine Learning · Computer Science 2024-02-07 Zhengyan Zhang, Yixin Song, Guanghui Yu, Xu Han, Yankai Lin, Chaojun Xiao, Chenyang Song, Zhiyuan Liu, Zeyu Mi, Maosong Sun

SwishReLU: A Unified Approach to Activation Functions for Enhanced Deep Neural Networks Performance

ReLU, a commonly used activation function in deep neural networks, is prone to the issue of "Dying ReLU". Several enhanced versions, such as ELU, SeLU, and Swish, have been introduced and are considered to be less commonly utilized.…

Machine Learning · Computer Science 2024-07-12 Jamshaid Ul Rahman, Rubiqa Zulfiqar, Asad Khan, Nimra

Smooth activations and reproducibility in deep networks

Deep networks are gradually penetrating almost every domain in our lives due to their amazing success. However, with substantive performance accuracy improvements comes the price of irreproducibility. Two identical models, trained on…

Machine Learning · Computer Science 2020-12-02 Gil I. Shamir, Dong Lin, Lorenzo Coviello

PLU: The Piecewise Linear Unit Activation Function

Successive linear transforms followed by nonlinear "activation" functions can approximate nonlinear functions to arbitrary precision given sufficient layers. The number of necessary layers depends, in part, on the nature of the…

Neural and Evolutionary Computing · Computer Science 2018-09-26 Andrei Nicolae

Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics

Activation functions are fundamental elements of deep learning architectures as they significantly influence training dynamics. ReLU, while widely used, is prone to the dying neuron problem, which has been mitigated by variants such as…

Machine Learning · Computer Science 2025-05-22 Indrashis Das, Mahmoud Safari, Steven Adriaensen, Frank Hutter

Improved weight initialization for deep and narrow feedforward neural network

Appropriate weight initialization settings, along with the ReLU activation function, have become cornerstones of modern deep learning, enabling the training and deployment of highly effective and efficient neural network models across…

Machine Learning · Computer Science 2024-04-02 Hyunwoo Lee, Yunho Kim, Seung Yeop Yang, Hayoung Choi

Effects of the Nonlinearity in Activation Functions on the Performance of Deep Learning Models

The nonlinearity of activation functions used in deep learning models is crucial for the success of predictive models. There are several commonly used simple nonlinear functions, including Rectified Linear Unit (ReLU) and Leaky-ReLU…

Machine Learning · Computer Science 2020-10-16 Nalinda Kulathunga, Nishath Rajiv Ranasinghe, Daniel Vrinceanu, Zackary Kinsman, Lei Huang, Yunjiao Wang

Synaptic Stripping: How Pruning Can Bring Dead Neurons Back To Life

Rectified Linear Units (ReLU) are the default choice for activation functions in deep neural networks. While they demonstrate excellent empirical performance, ReLU activations can fall victim to the dead neuron problem. In these cases, the…

Machine Learning · Computer Science 2023-02-14 Tim Whitaker, Darrell Whitley

Clustering-Based Interpretation of Deep ReLU Network

Amongst others, the adoption of Rectified Linear Units (ReLUs) is regarded as one of the ingredients of the success of deep learning. ReLU activation has been shown to mitigate the vanishing gradient issue, to encourage sparsity in the…

Machine Learning · Statistics 2021-10-14 Nicola Picchiotti, Marco Gori

Deriving Activation Functions Using Integration

Our work proposes a novel approach to designing activation functions by focusing on their gradients and deriving the corresponding activation functions using integration. We introduce the Expanded Integral of the Exponential Linear Unit…

Machine Learning · Computer Science 2025-02-04 Allen Hao Huang, Imanol Schlag