Related papers: Multilevel Initialization for Layer-Parallel Deep …

200 papers

Recurrent Neural Networks (RNNs) can be seriously impacted by the initial parameters assignment, which may result in poor generalization performances on new unseen data. With the objective to tackle this crucial issue, in the context of RNN…

Machine Learning · Computer Science 2019-11-05 Dino Ienco, Roberto Interdonato, Raffaele Gaetano

A common challenge in regression is that for many problems, the degrees of freedom required for a high-quality solution also allows for overfitting. Regularization is a class of strategies that seek to restrict the range of possible…

Machine Learning · Computer Science 2022-11-15 Colin Ponce, Ruipeng Li, Christina Mao, Panayot Vassilevski

We present a new multilevel minimization framework for the training of deep residual networks (ResNets), which has the potential to significantly reduce training time and effort. Our framework is based on the dynamical system's viewpoint,…

Machine Learning · Computer Science 2020-04-15 Lisa Gaedke-Merzhäuser, Alena Kopaničáková, Rolf Krause

Residual neural networks (ResNets) are a promising class of deep neural networks that have shown excellent performance for a number of learning tasks, e.g., image classification and recognition. Mathematically, ResNet architectures can be…

Optimization and Control · Mathematics 2019-07-26 S. Günther, L. Ruthotto, J. B. Schroder, E. C. Cyr, N. R. Gauger

In this work, we propose a multi-stage training strategy for the development of deep learning algorithms applied to problems with multiscale features. Each stage of the proposed strategy shares an (almost) identical network structure and…

Numerical Analysis · Mathematics 2020-09-25 Eric Chung, Wing Tat Leung, Sai-Mang Pun, Zecheng Zhang

Factorized layers--operations parameterized by products of two or more matrices--occur in a variety of deep learning contexts, including compressed model training, certain types of knowledge distillation, and multi-head self-attention…

Machine Learning · Statistics 2022-10-07 Mikhail Khodak, Neil Tenenholtz, Lester Mackey, Nicolò Fusi

When optimizing over-parameterized models, such as deep neural networks, a large set of parameters can achieve zero training error. In such cases, the choice of the optimization algorithm and its respective hyper-parameters introduces…

Machine Learning · Computer Science 2019-12-06 Gauthier Gidel, Francis Bach, Simon Lacoste-Julien

The past few years have witnessed growth in the computational requirements for training deep convolutional neural networks. Current approaches parallelize training onto multiple devices by applying a single parallelization strategy (e.g.,…

Machine Learning · Computer Science 2018-06-12 Zhihao Jia, Sina Lin, Charles R. Qi, Alex Aiken

In this paper, inspired by the multigrid method, we propose a multi-level deep framework for deep solvers. Overall, it divides the entire training process into different levels of training. At each level of training, an adaptive sampling…

Numerical Analysis · Mathematics 2026-02-23 Yu Yang, Qiaolin He

Deep neural networks are capable of modelling highly non-linear functions by capturing different levels of abstraction of data hierarchically. While training deep networks, first the system is initialized near a good optimum by greedy…

Machine Learning · Computer Science 2016-03-10 Anirban Santara, Debapriya Maji, DP Tejas, Pabitra Mitra, Arobinda Gupta

Training a neural network (NN) depends on multiple factors, including but not limited to the initial weights. In this paper, we focus on initializing deep NN parameters such that it performs better, comparing to random or zero…

Machine Learning · Computer Science 2020-11-10 Mohamad H. Danesh

Despite the recent success of stochastic gradient descent in deep learning, it is often difficult to train a deep neural network with an inappropriate choice of its initial parameters. Even if training is successful, it has been known that…

Machine Learning · Computer Science 2023-02-10 Cheolhyoung Lee, Kyunghyun Cho

Neural networks trained via gradient descent with random initialization and without any regularization enjoy good generalization performance in practice despite being highly overparametrized. A promising direction to explain this phenomenon…

Machine Learning · Computer Science 2022-05-17 Hancheng Min, Salma Tarmoun, Rene Vidal, Enrique Mallada

In recent years, deep learning has been connected with optimal control as a way to define a notion of a continuous underlying learning problem. In this view, neural networks can be interpreted as a discretization of a parametric Ordinary…

Optimization and Control · Mathematics 2020-07-07 Joubine Aghili, Olga Mula

Adversarial training has been shown to regularize deep neural networks in addition to increasing their robustness to adversarial examples. However, its impact on very deep state of the art networks has not been fully investigated. In this…

Computer Vision and Pattern Recognition · Computer Science 2018-05-30 Swami Sankaranarayanan, Arpit Jain, Rama Chellappa, Ser Nam Lim

To theoretically understand the behavior of trained deep neural networks, it is necessary to study the dynamics induced by gradient methods from a random initialization. However, the nonlinear and compositional structure of these models…

Machine Learning · Computer Science 2021-12-21 Karl Hajjar, Lénaïc Chizat, Christophe Giraud

Training neural networks with first order optimisation methods is at the core of the empirical success of deep learning. The scale of initialisation is a crucial factor, as small initialisations are generally associated to a feature…

Machine Learning · Computer Science 2025-09-16 Etienne Boursier, Nicolas Flammarion

We construct custom regularization functions for use in supervised training of deep neural networks. Our technique is applicable when the ground-truth labels themselves exhibit internal structure; we derive a regularizer by learning an…

Computer Vision and Pattern Recognition · Computer Science 2018-04-09 Mohammadreza Mostajabi, Michael Maire, Gregory Shakhnarovich

Despite the remarkable success of deep learning in pattern recognition, deep network models face the problem of training a large number of parameters. In this paper, we propose and evaluate a novel multi-path wavelet neural network…

Computer Vision and Pattern Recognition · Computer Science 2019-08-27 D. D. N. De Silva, H. W. M. K. Vithanage, K. S. D. Fernando, I. T. S. Piyatilake

Research aimed at scaling up neuroscience inspired learning algorithms for neural networks is accelerating. Recently, a key research area has been the study of energy-based learning algorithms such as predictive coding, due to their…

Machine Learning · Computer Science 2026-01-30 Luca Pinchetti, Simon Frieder, Thomas Lukasiewicz, Tommaso Salvatori