Related papers: Evolving Normalization-Activation Layers

200 papers

In many information processing systems, it may be desirable to ensure that any change of the input, whether by shifting or scaling, results in a corresponding change in the system response. While deep neural networks are gradually replacing…

Computer Vision and Pattern Recognition · Computer Science · 2024-02-22 · Sébastien Herbreteau, Emmanuel Moebel, Charles Kervrann

The success of deep learning is inseparable from normalization layers. Researchers have proposed various normalization functions, and each of them has both advantages and disadvantages. In response, efforts have been made to design a…

Machine Learning · Computer Science · 2024-02-20 · Zikai Zhou, Shuo Zhang, Ziruo Wang, Huanran Chen

Normalization techniques have only recently begun to be exploited in supervised learning tasks. Batch normalization exploits mini-batch statistics to normalize the activations. This was shown to speed up training and result in better…

Machine Learning · Computer Science · 2017-03-08 · Mengye Ren, Renjie Liao, Raquel Urtasun, Fabian H. Sinz, Richard S. Zemel

Normalization techniques have become a basic component in modern convolutional neural networks (ConvNets). In particular, many recent works demonstrate that promoting the orthogonality of the weights helps train deep models and improve…

Computer Vision and Pattern Recognition · Computer Science · 2022-01-05 · Sheng Liu, Xiao Li, Yuexiang Zhai, Chong You, Zhihui Zhu, Carlos Fernandez-Granda, Qing Qu

The hyper-parameters of a neural network are traditionally designed through a time-consuming process of trial and error that requires substantial expert knowledge. Neural Architecture Search (NAS) algorithms aim to take the human out of the…

Neural and Evolutionary Computing · Computer Science · 2021-06-01 · Andrew Nader, Danielle Azar

Convolutional Neural Networks (CNNs) have been widely applied. But as the CNNs grow, the number of arithmetic operations and memory footprint also increase. Furthermore, typical non-linear activation functions do not allow associativity of…

Machine Learning · Computer Science · 2021-11-10 · Eduardo Vera Sousa, Leandro A. F. Fernandes, Cristina Nader Vasconcelos

A technical note aiming to offer deeper intuition for the LayerNorm function common in deep neural networks. LayerNorm is defined relative to a distinguished 'neural' basis, but it does more than just normalize the corresponding vector…

Machine Learning · Computer Science · 2024-05-08 · Paul M. Riechers

Despite the increasing prevalence of deep neural networks, their applicability in resource-constrained devices is limited due to their computational load. While modern devices exhibit a high level of parallelism, real-time latency is still…

Computer Vision and Pattern Recognition · Computer Science · 2021-09-06 · Amir Ben Dror, Niv Zehngut, Avraham Raviv, Evgeny Artyomov, Ran Vitek, Roy Jevnisek

A popular method to reduce the training time of deep neural networks is to normalize activations at each layer. Although various normalization schemes have been proposed, they all follow a common theme: normalize across spatial dimensions…

Computer Vision and Pattern Recognition · Computer Science · 2019-12-20 · Boyi Li, Felix Wu, Kilian Q. Weinberger, Serge Belongie

Inspired by BatchNorm, there has been an explosion of normalization layers in deep learning. Recent works have identified a multitude of beneficial properties in BatchNorm to explain its success. However, given the pursuit of alternative…

Machine Learning · Computer Science · 2021-10-27 · Ekdeep Singh Lubana, Robert P. Dick, Hidenori Tanaka

Deep feedforward neural networks with piecewise linear activations are currently producing the state-of-the-art results in several public datasets. The combination of deep learning models and piecewise linear activation functions allows for…

Computer Vision and Pattern Recognition · Computer Science · 2015-11-03 · Zhibin Liao, Gustavo Carneiro

Recent studies revealed that convolutional neural networks do not generalize well to small image transformations, e.g. rotations by a few degrees or translations of a few pixels. To improve the robustness to such transformations, we propose…

Computer Vision and Pattern Recognition · Computer Science · 2022-08-23 · Adrian Sandru, Mariana-Iuliana Georgescu, Radu Tudor Ionescu

Activation functions play a decisive role in determining the capacity of Deep Neural Networks as they enable neural networks to capture inherent nonlinearities present in data fed to them. The prior research on activation functions…

Computer Vision and Pattern Recognition · Computer Science · 2023-05-31 · Jamshaid Ul Rahman, Faiza Makhdoom, Dianchen Lu

Artificial neural networks (ANN), typically referred to as neural networks, are a class of Machine Learning algorithms and have achieved widespread success, having been inspired by the biological structure of the human brain. Neural…

Machine Learning · Computer Science · 2022-04-08 · Murilo Gustineli

We develop a new method for regularising neural networks. We learn a probability distribution over the activations of all layers of the model and then insert imputed values into the network during training. We obtain a posterior for an…

Machine Learning · Computer Science · 2019-10-14 · Matthew Willetts, Alexander Camuto, Stephen Roberts, Chris Holmes

Activation functions (AFs) play a pivotal role in the performance of neural networks. The Rectified Linear Unit (ReLU) is currently the most commonly used AF. Several replacements to ReLU have been suggested but improvements have proven…

Neural and Evolutionary Computing · Computer Science · 2022-06-27 · Raz Lapid, Moshe Sipper

Deep neural networks are often used to implement powerful generative models for real-world data. Notable applications include image denoising, as well as other classical inverse problems like compressed sensing and super-resolution. To…

Machine Learning · Computer Science · 2026-02-23 · Ruhui Jin, Dustin G. Mixon, Soledad Villar

Subsampling layers play a crucial role in deep nets by discarding a portion of an activation map to reduce its spatial dimensions. This encourages the deep net to learn higher-level representations. Contrary to this motivation, we…

Computer Vision and Pattern Recognition · Computer Science · 2024-10-03 · Chiao-An Yang, Ziwei Liu, Raymond A. Yeh

Layer normalization (LN) is a fundamental component in modern deep learning, but its per-sample centering and scaling introduce non-negligible inference overhead. RMSNorm improves efficiency by removing the centering operation, yet this may…

Machine Learning · Computer Science · 2026-05-15 · Yuxin Guo, Yihao Yue, Yunhao Ni, Yizhou Ruan, Jie Luo, Wenjun Wu, Lei Huang

Many activation functions have been proposed in the past, but selecting an adequate one requires trial and error. We propose a new methodology of designing activation functions within a neural network at each layer. We call this technique…

Machine Learning · Statistics · 2017-02-28 · Mark Harmon, Diego Klabjan