Related papers: Data-driven Weight Initialization with Sylvester S…

Reducing Neural Network Parameter Initialization Into an SMT Problem

Training a neural network (NN) depends on multiple factors, including but not limited to the initial weights. In this paper, we focus on initializing deep NN parameters such that it performs better, comparing to random or zero…

Machine Learning · Computer Science 2020-11-10 Mohamad H. Danesh

A Good Start Matters: Enhancing Continual Learning with Data-Driven Weight Initialization

To adapt to real-world data streams, continual learning (CL) systems must rapidly learn new concepts while preserving and utilizing prior knowledge. When it comes to adding new information to continually-trained deep neural networks (DNNs),…

Machine Learning · Computer Science 2025-07-02 Md Yousuf Harun, Christopher Kanan

On the Initialization of Long Short-Term Memory Networks

Weight initialization is important for faster convergence and stability of deep neural networks training. In this paper, a robust initialization method is developed to address the training instability in long short-term memory (LSTM)…

Machine Learning · Computer Science 2019-12-24 Mostafa Mehdipour Ghazi, Mads Nielsen, Akshay Pai, Marc Modat, M. Jorge Cardoso, Sebastien Ourselin, Lauge Sorensen

Weight Initialization of Deep Neural Networks(DNNs) using Data Statistics

Deep neural networks (DNNs) form the backbone of almost every state-of-the-art technique in the fields such as computer vision, speech processing, and text analysis. The recent advances in computational technology have made the use of DNNs…

Machine Learning · Computer Science 2018-03-20 Saiprasad Koturwar, Shabbir Merchant

Multilevel Initialization for Layer-Parallel Deep Neural Network Training

This paper investigates multilevel initialization strategies for training very deep neural networks with a layer-parallel multigrid solver. The scheme is based on the continuous interpretation of the training problem as a problem of optimal…

Machine Learning · Computer Science 2019-12-20 Eric C. Cyr, Stefanie Günther, Jacob B. Schroder

An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification

The growing interest in satellite imagery has triggered the need for efficient mechanisms to extract valuable information from these vast data sources, providing deeper insights. Even though deep learning has shown significant progress in…

Computer Vision and Pattern Recognition · Computer Science 2024-06-04 Wadii Boulila, Eman Alshanqiti, Ayyub Alzahem, Anis Koubaa, Nabil Mlaiki

Target noise: A pre-training based neural network initialization for efficient high resolution learning

Weight initialization plays a crucial role in the optimization behavior and convergence efficiency of neural networks. Most existing initialization methods, such as Xavier and Kaiming initializations, rely on random sampling and do not…

Machine Learning · Computer Science 2026-02-09 Shaowen Wang, Tariq Alkhalifah

Unsupervised Learning of Initialization in Deep Neural Networks via Maximum Mean Discrepancy

Despite the recent success of stochastic gradient descent in deep learning, it is often difficult to train a deep neural network with an inappropriate choice of its initial parameters. Even if training is successful, it has been known that…

Machine Learning · Computer Science 2023-02-10 Cheolhyoung Lee, Kyunghyun Cho

A Weight Initialization Based on the Linear Product Structure for Neural Networks

Weight initialization plays an important role in training neural networks and also affects tremendous deep learning applications. Various weight initialization strategies have already been developed for different activation functions with…

Machine Learning · Computer Science 2022-08-09 Qipin Chen, Wenrui Hao, Juncai He

Initialization of ReLUs for Dynamical Isometry

Deep learning relies on good initialization schemes and hyperparameter choices prior to training a neural network. Random weight initializations induce random network ensembles, which give rise to the trainability, training speed, and…

Machine Learning · Statistics 2019-10-25 Rebekka Burkholz, Alina Dubatovka

Initializing Models with Larger Ones

Weight initialization plays an important role in neural network training. Widely used initialization methods are proposed and evaluated for networks that are trained from scratch. However, the growing number of pretrained models now offers…

Machine Learning · Computer Science 2023-12-01 Zhiqiu Xu, Yanjie Chen, Kirill Vishniakov, Yida Yin, Zhiqiang Shen, Trevor Darrell, Lingjie Liu, Zhuang Liu

Revisiting Initialization of Neural Networks

The proper initialization of weights is crucial for the effective training and fast convergence of deep neural networks (DNNs). Prior work in this area has mostly focused on balancing the variance among weights per layer to maintain…

Machine Learning · Computer Science 2020-06-05 Maciej Skorski, Alessandro Temperoni, Martin Theobald

Supervised level-wise pretraining for recurrent neural network initialization in multi-class classification

Recurrent Neural Networks (RNNs) can be seriously impacted by the initial parameters assignment, which may result in poor generalization performances on new unseen data. With the objective to tackle this crucial issue, in the context of RNN…

Machine Learning · Computer Science 2019-11-05 Dino Ienco, Roberto Interdonato, Raffaele Gaetano

How to Initialize your Network? Robust Initialization for WeightNorm & ResNets

Residual networks (ResNet) and weight normalization play an important role in various deep learning applications. However, parameter initialization strategies have not been studied previously for weight normalized networks and, in practice,…

Machine Learning · Statistics 2019-10-31 Devansh Arpit, Victor Campos, Yoshua Bengio

Good Initializations of Variational Bayes for Deep Models

Stochastic variational inference is an established way to carry out approximate Bayesian inference for deep models. While there have been effective proposals for good initializations for loss minimization in deep learning, far less…

Machine Learning · Statistics 2019-01-28 Simone Rossi, Pietro Michiardi, Maurizio Filippone

Nonparametric Weight Initialization of Neural Networks via Integral Representation

A new initialization method for hidden parameters in a neural network is proposed. Derived from the integral representation of the neural network, a nonparametric probability distribution of hidden parameters is introduced. In this…

Machine Learning · Computer Science 2014-02-20 Sho Sonoda, Noboru Murata

Data-dependent Initializations of Convolutional Neural Networks

Convolutional Neural Networks spread through computer vision like a wildfire, impacting almost all visual tasks imaginable. Despite this, few researchers dare to train their models from scratch. Most work builds on one of a handful of…

Computer Vision and Pattern Recognition · Computer Science 2016-09-26 Philipp Krähenbühl, Carl Doersch, Jeff Donahue, Trevor Darrell

Historical Document Image Segmentation with LDA-Initialized Deep Neural Networks

In this paper, we present a novel approach to perform deep neural networks layer-wise weight initialization using Linear Discriminant Analysis (LDA). Typically, the weights of a deep neural network are initialized with: random values,…

Computer Vision and Pattern Recognition · Computer Science 2017-11-27 Michele Alberti, Mathias Seuret, Vinaychandran Pondenkandath, Rolf Ingold, Marcus Liwicki

An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks

Network initialization is the first and critical step for training neural networks. In this paper, we propose a novel network initialization scheme based on the celebrated Stein's identity. By viewing multi-layer feedforward neural networks…

Machine Learning · Computer Science 2020-06-26 Zebin Yang, Hengtao Zhang, Agus Sudjianto, Aijun Zhang

Revisit Multinomial Logistic Regression in Deep Learning: Data Dependent Model Initialization for Image Recognition

We study in this paper how to initialize the parameters of multinomial logistic regression (a fully connected layer followed with softmax and cross entropy loss), which is widely used in deep neural network (DNN) models for classification…

Computer Vision and Pattern Recognition · Computer Science 2018-09-18 Bowen Cheng, Rong Xiao, Yandong Guo, Yuxiao Hu, Jianfeng Wang, Lei Zhang