Related papers: Stable ResNet

Forward Stability of ResNet and Its Variants

The residual neural network (ResNet) is a popular deep network architecture which has the ability to obtain high-accuracy results on several image processing problems. In order to analyze the behavior and structure of ResNet, recent work…

Computer Vision and Pattern Recognition · Computer Science 2018-11-27 Linan Zhang, Hayden Schaeffer

Stable Architectures for Deep Neural Networks

Deep neural networks have become invaluable tools for supervised machine learning, e.g., classification of text or images. While often offering superior results over traditional techniques and successfully expressing complicated patterns in…

Machine Learning · Computer Science 2019-02-19 Eldad Haber, Lars Ruthotto

GloNets: Globally Connected Neural Networks

Deep learning architectures suffer from depth-related performance degradation, limiting the effective depth of neural networks. Approaches like ResNet are able to mitigate this, but they do not completely eliminate the problem. We introduce…

Machine Learning · Computer Science 2023-11-28 Antonio Di Cecco, Carlo Metta, Marco Fantozzi, Francesco Morandin, Maurizio Parton

The exploding gradient problem demystified - definition, prevalence, impact, origin, tradeoffs, and solutions

Whereas it is believed that techniques such as Adam, batch normalization and, more recently, SeLU nonlinearities "solve" the exploding gradient problem, we show that this is not the case in general and that in a range of popular MLP…

Machine Learning · Computer Science 2018-04-10 George Philipp, Dawn Song, Jaime G. Carbonell

Scaling ResNets in the Large-depth Regime

Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid…

Machine Learning · Computer Science 2025-03-04 Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert

Are ResNets Provably Better than Linear Predictors?

A residual network (or ResNet) is a standard deep neural net architecture, with state-of-the-art performance across numerous applications. The main premise of ResNets is that they allow the training of each layer to focus on fitting just…

Machine Learning · Computer Science 2018-09-28 Ohad Shamir

The Shattered Gradients Problem: If resnets are the answer, then what is the question?

A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients. Although the problem has largely been overcome via carefully constructed initializations and batch normalization, architectures…

Neural and Evolutionary Computing · Computer Science 2018-06-07 David Balduzzi, Marcus Frean, Lennox Leary, JP Lewis, Kurt Wan-Duo Ma, Brian McWilliams

Residual Networks: Lyapunov Stability and Convex Decomposition

While training error of most deep neural networks degrades as the depth of the network increases, residual networks appear to be an exception. We show that the main reason for this is the Lyapunov stability of the gradient descent…

Machine Learning · Computer Science 2018-03-23 Kamil Nar, Shankar Sastry

Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning

Scaling deep reinforcement learning networks is challenging and often results in degraded performance, yet the root causes of this failure mode remain poorly understood. Several recent works have proposed mechanisms to address this, but…

Machine Learning · Computer Science 2026-02-03 Roger Creus Castanyer, Johan Obando-Ceron, Lu Li, Pierre-Luc Bacon, Glen Berseth, Aaron Courville, Pablo Samuel Castro

Convergence Analysis of Deep Residual Networks

Various powerful deep neural network architectures have made great contributions to the exciting successes of deep learning in the past two decades. Among them, deep Residual Networks (ResNets) are of particular importance because they…

Machine Learning · Computer Science 2022-05-16 Wentao Huang, Haizhang Zhang

Identity Connections in Residual Nets Improve Noise Stability

Residual Neural Networks (ResNets) achieve state-of-the-art performance in many computer vision problems. Compared to plain networks without residual connections (PlnNets), ResNets train faster, generalize better, and suffer less from the…

Machine Learning · Computer Science 2019-05-28 Shuzhi Yu, Carlo Tomasi

Learning Strict Identity Mappings in Deep Residual Networks

A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation. The ability to train very deep…

Computer Vision and Pattern Recognition · Computer Science 2019-06-18 Xin Yu, Zhiding Yu, Srikumar Ramalingam

Stable Recurrent Models

Stability is a fundamental property of dynamical systems, yet to this date it has had little bearing on the practice of recurrent neural networks. In this work, we conduct a thorough investigation of stable recurrent models. Theoretically,…

Machine Learning · Computer Science 2019-03-05 John Miller, Moritz Hardt

Recurrent neural networks: vanishing and exploding gradients are not the end of the story

Recurrent neural networks (RNNs) notoriously struggle to learn long-term memories, primarily due to vanishing and exploding gradients. The recent success of state-space models (SSMs), a subclass of RNNs, to overcome such difficulties…

Machine Learning · Computer Science 2024-11-06 Nicolas Zucchet, Antonio Orvieto

Optimal Depth of Neural Networks

Determining the optimal depth of a neural network is a fundamental yet challenging problem, typically resolved through resource-intensive experimentation. This paper introduces a formal theoretical framework to address this question by…

Machine Learning · Computer Science 2025-06-23 Qian Qi

Norm-Preservation: Why Residual Networks Can Become Extremely Deep?

Augmenting neural networks with skip connections, as introduced in the so-called ResNet architecture, surprised the community by enabling the training of networks of more than 1,000 layers with significant performance gains. This paper…

Computer Vision and Pattern Recognition · Computer Science 2020-04-24 Alireza Zaeemzadeh, Nazanin Rahnavard, Mubarak Shah

On residual network depth

Deep residual architectures, such as ResNet and the Transformer, have enabled models of unprecedented depth, yet a formal understanding of why depth is so effective remains an open question. A popular intuition, following Veit et al.…

Machine Learning · Computer Science 2025-10-09 Benoit Dherin, Michael Munn

Volume-preserving Neural Networks

We propose a novel approach to addressing the vanishing (or exploding) gradient problem in deep neural networks. We construct a new architecture for deep neural networks where all layers (except the output layer) of the network are a…

Machine Learning · Computer Science 2021-04-27 Gordon MacDonald, Andrew Godbout, Bryn Gillcash, Stephanie Cairns

Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness

The exploding and vanishing gradient problem has been the major conceptual principle behind most architecture and training improvements in recurrent neural networks (RNNs) during the last decade. In this paper, we argue that this principle,…

Machine Learning · Computer Science 2020-03-06 Antônio H. Ribeiro, Koen Tiels, Luis A. Aguirre, Thomas B. Schön

Deep Residual Networks and Weight Initialization

Residual Network (ResNet) is a state-of-the-art architecture that enables successful training of very deep neural networks. It is also known that good weight initialization of a neural network avoids the problem of vanishing/exploding…

Machine Learning · Computer Science 2017-10-16 Masato Taki