Related papers: Scale Normalization

Scale-Regularized Filter Learning

We start out by demonstrating that an elementary learning task, corresponding to the training of a single linear neuron in a convolutional neural network, can be solved for feature spaces of very high dimensionality. In a second step,…

Computer Vision and Pattern Recognition · Computer Science 2017-07-11 Marco Loog, François Lauze

Dynamical Isometry: The Missing Ingredient for Neural Network Pruning

Several recent works [40, 24] observed an interesting phenomenon in neural network pruning: a larger finetuning learning rate can improve the final performance significantly. Unfortunately, the reason behind it remains elusive to date.…

Machine Learning · Computer Science 2021-05-14 Huan Wang, Can Qin, Yue Bai, Yun Fu

Scaling and Resizing Symmetry in Feedforward Networks

Weight initialization in deep neural networks has a strong impact on the speed of convergence of the learning map. Recent studies have shown that in the case of random initializations, a chaos/order phase transition occurs in the space of…

Machine Learning · Computer Science 2023-06-28 Carlos Cardona

Scaling Laws For Deep Learning Based Image Reconstruction

Deep neural networks trained end-to-end to map a measurement of a (noisy) image to a clean image perform excellently for a variety of linear inverse problems. Current methods are only trained on a few hundred or thousand images as…

Image and Video Processing · Electrical Eng. & Systems 2023-02-24 Tobit Klug, Reinhard Heckel

On the Importance of Consistency in Training Deep Neural Networks

We explain that the difficulties of training deep neural networks come from a syndrome of three consistency issues. This paper describes our efforts in their analysis and treatment. The first issue is the training speed inconsistency in…

Machine Learning · Computer Science 2017-08-03 Chengxi Ye, Yezhou Yang, Cornelia Fermuller, Yiannis Aloimonos

Return-based Scaling: Yet Another Normalisation Trick for Deep RL

Scaling issues are mundane yet irritating for practitioners of reinforcement learning. Error scales vary across domains, tasks, and stages of learning; sometimes by many orders of magnitude. This can be detrimental to learning speed and…

Machine Learning · Computer Science 2021-05-13 Tom Schaul, Georg Ostrovski, Iurii Kemaev, Diana Borsa

With Greater Distance Comes Worse Performance: On the Perspective of Layer Utilization and Model Generalization

Generalization of deep neural networks remains one of the main open problems in machine learning. Previous theoretical works focused on deriving tight bounds of model complexity, while empirical works revealed that neural networks exhibit…

Machine Learning · Computer Science 2022-01-31 James Wang, Cheng-Lin Yang

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Scaling deep reinforcement learning networks is challenging and often results in degraded performance, yet the root causes of this failure mode remain poorly understood. Several recent works have proposed mechanisms to address this, but…

Machine Learning · Computer Science 2026-02-03 Roger Creus Castanyer, Johan Obando-Ceron, Lu Li, Pierre-Luc Bacon, Glen Berseth, Aaron Courville, Pablo Samuel Castro

Scaling ResNets in the Large-depth Regime

Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid…

Machine Learning · Computer Science 2025-03-04 Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert

Learning in Compact Spaces with Approximately Normalized Transformer

The successful training of deep neural networks requires addressing challenges such as overfitting, numerical instabilities leading to divergence, and increasing variance in the residual stream. A common solution is to apply regularization…

Machine Learning · Computer Science 2025-11-20 Jörg K. H. Franke, Urs Spiegelhalter, Marianna Nezhurina, Jenia Jitsev, Frank Hutter, Michael Hefenbrock

Accelerating Training of Deep Neural Networks with a Standardization Loss

A significant advance in accelerating neural network training has been the development of normalization methods, permitting the training of deep models both faster and with better accuracy. These advances come with practical challenges: for…

Machine Learning · Computer Science 2019-03-05 Jasmine Collins, Johannes Balle, Jonathon Shlens

Adaptive Scaling

Preprocessing data is an important step before any data analysis. In this paper, we focus on one particular aspect, namely scaling or normalization. We analyze various scaling methods in common use and study their effects on different…

Machine Learning · Statistics 2017-09-05 Ting Li, Bingyi Jing, Ningchen Ying, Xianshi Yu

Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks

Small neural networks with a constrained number of trainable parameters can be suitable resource-efficient candidates for many simple tasks where excessively large models are now used. However, such models face several problems during the…

Machine Learning · Computer Science 2021-09-21 Alexander Kovalenko, Pavel Kordík, Magda Friedjungová

Training Deep Neural Networks Without Batch Normalization

Training neural networks is an optimization problem, and finding a decent set of parameters through gradient descent can be a difficult task. A host of techniques has been developed to aid this process before and during the training phase.…

Machine Learning · Computer Science 2020-08-19 Divya Gaur, Joachim Folz, Andreas Dengel

Unified Neural Network Scaling Laws and Scale-time Equivalence

As neural networks continue to grow in size but datasets might not, it is vital to understand how much performance improvement can be expected: is it more important to scale network size or data volume? Thus, neural network scaling laws,…

Machine Learning · Computer Science 2024-09-10 Akhilan Boopathy, Ila Fiete

Fixup Initialization: Residual Learning Without Normalization

Normalization layers are a staple in state-of-the-art deep neural network architectures. They are widely believed to stabilize training, enable higher learning rates, accelerate convergence and improve generalization, though the reason for…

Machine Learning · Computer Science 2019-03-13 Hongyi Zhang, Yann N. Dauphin, Tengyu Ma

New Interpretations of Normalization Methods in Deep Learning

In recent years, a variety of normalization methods have been proposed to help train neural networks, such as batch normalization (BN), layer normalization (LN), weight normalization (WN), group normalization (GN), etc. However,…

Machine Learning · Computer Science 2020-06-17 Jiacheng Sun, Xiangyong Cao, Hanwen Liang, Weiran Huang, Zewei Chen, Zhenguo Li

Diagonal Rescaling For Neural Networks

We define a second-order neural network stochastic gradient training algorithm whose block-diagonal structure effectively amounts to normalizing the unit activations. Investigating why this algorithm lacks robustness then reveals two…

Machine Learning · Computer Science 2017-05-29 Jean Lafond, Nicolas Vasilache, Léon Bottou

Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures

Conventional scaling of neural networks typically involves designing a base network and growing different dimensions like width, depth, etc. of the same by some predefined scaling factors. We introduce an automated scaling approach…

Machine Learning · Computer Science 2024-02-21 Akash Guna R. T, Arnav Chavan, Deepak Gupta

Normalization and effective learning rates in reinforcement learning

Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature, with several works highlighting diverse benefits such as improving loss landscape conditioning and combatting…

Machine Learning · Computer Science 2024-07-03 Clare Lyle, Zeyu Zheng, Khimya Khetarpal, James Martens, Hado van Hasselt, Razvan Pascanu, Will Dabney