Related papers: Stable ResNet

200 papers

The residual neural network (ResNet) is a popular deep network architecture which has the ability to obtain high-accuracy results on several image processing problems. In order to analyze the behavior and structure of ResNet, recent work…

Computer Vision and Pattern Recognition · Computer Science 2018-11-27 Linan Zhang, Hayden Schaeffer

Deep neural networks have become invaluable tools for supervised machine learning, e.g., classification of text or images. While often offering superior results over traditional techniques and successfully expressing complicated patterns in…

Machine Learning · Computer Science 2019-02-19 Eldad Haber, Lars Ruthotto

Deep learning architectures suffer from depth-related performance degradation, limiting the effective depth of neural networks. Approaches like ResNet are able to mitigate this, but they do not completely eliminate the problem. We introduce…

Machine Learning · Computer Science 2023-11-28 Antonio Di Cecco, Carlo Metta, Marco Fantozzi, Francesco Morandin, Maurizio Parton

Whereas it is believed that techniques such as Adam, batch normalization and, more recently, SeLU nonlinearities "solve" the exploding gradient problem, we show that this is not the case in general and that in a range of popular MLP…

Machine Learning · Computer Science 2018-04-10 George Philipp, Dawn Song, Jaime G. Carbonell

Deep ResNets are recognized for achieving state-of-the-art results in complex machine learning tasks. However, the remarkable performance of these architectures relies on a training procedure that needs to be carefully crafted to avoid…

Machine Learning · Computer Science 2025-03-04 Pierre Marion, Adeline Fermanian, Gérard Biau, Jean-Philippe Vert

A residual network (or ResNet) is a standard deep neural net architecture, with state-of-the-art performance across numerous applications. The main premise of ResNets is that they allow the training of each layer to focus on fitting just…

Machine Learning · Computer Science 2018-09-28 Ohad Shamir

A long-standing obstacle to progress in deep learning is the problem of vanishing and exploding gradients. Although the problem has largely been overcome via carefully constructed initializations and batch normalization, architectures…

Neural and Evolutionary Computing · Computer Science 2018-06-07 David Balduzzi, Marcus Frean, Lennox Leary, JP Lewis, Kurt Wan-Duo Ma, Brian McWilliams

While training error of most deep neural networks degrades as the depth of the network increases, residual networks appear to be an exception. We show that the main reason for this is the Lyapunov stability of the gradient descent…

Machine Learning · Computer Science 2018-03-23 Kamil Nar, Shankar Sastry

Scaling deep reinforcement learning networks is challenging and often results in degraded performance, yet the root causes of this failure mode remain poorly understood. Several recent works have proposed mechanisms to address this, but…

Various powerful deep neural network architectures have made great contributions to the exciting successes of deep learning in the past two decades. Among them, deep Residual Networks (ResNets) are of particular importance because they…

Machine Learning · Computer Science 2022-05-16 Wentao Huang, Haizhang Zhang

Residual Neural Networks (ResNets) achieve state-of-the-art performance in many computer vision problems. Compared to plain networks without residual connections (PlnNets), ResNets train faster, generalize better, and suffer less from the…

Machine Learning · Computer Science 2019-05-28 Shuzhi Yu, Carlo Tomasi

A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation. The ability to train very deep…

Computer Vision and Pattern Recognition · Computer Science 2019-06-18 Xin Yu, Zhiding Yu, Srikumar Ramalingam

Stability is a fundamental property of dynamical systems, yet to this date it has had little bearing on the practice of recurrent neural networks. In this work, we conduct a thorough investigation of stable recurrent models. Theoretically,…

Machine Learning · Computer Science 2019-03-05 John Miller, Moritz Hardt

Recurrent neural networks (RNNs) notoriously struggle to learn long-term memories, primarily due to vanishing and exploding gradients. The recent success of state-space models (SSMs), a subclass of RNNs, to overcome such difficulties…

Machine Learning · Computer Science 2024-11-06 Nicolas Zucchet, Antonio Orvieto

Determining the optimal depth of a neural network is a fundamental yet challenging problem, typically resolved through resource-intensive experimentation. This paper introduces a formal theoretical framework to address this question by…

Machine Learning · Computer Science 2025-06-23 Qian Qi

Augmenting neural networks with skip connections, as introduced in the so-called ResNet architecture, surprised the community by enabling the training of networks of more than 1,000 layers with significant performance gains. This paper…

Computer Vision and Pattern Recognition · Computer Science 2020-04-24 Alireza Zaeemzadeh, Nazanin Rahnavard, Mubarak Shah

Deep residual architectures, such as ResNet and the Transformer, have enabled models of unprecedented depth, yet a formal understanding of why depth is so effective remains an open question. A popular intuition, following Veit et al.…

Machine Learning · Computer Science 2025-10-09 Benoit Dherin, Michael Munn

We propose a novel approach to addressing the vanishing (or exploding) gradient problem in deep neural networks. We construct a new architecture for deep neural networks where all layers (except the output layer) of the network are a…

Machine Learning · Computer Science 2021-04-27 Gordon MacDonald, Andrew Godbout, Bryn Gillcash, Stephanie Cairns

The exploding and vanishing gradient problem has been the major conceptual principle behind most architecture and training improvements in recurrent neural networks (RNNs) during the last decade. In this paper, we argue that this principle,…

Machine Learning · Computer Science 2020-03-06 Antônio H. Ribeiro, Koen Tiels, Luis A. Aguirre, Thomas B. Schön

Residual Network (ResNet) is the state-of-the-art architecture that realizes successful training of really deep neural networks. It is also known that good weight initialization of a neural network avoids the problem of vanishing/exploding…

Machine Learning · Computer Science 2017-10-16 Masato Taki