Data-driven Weight Initialization with Sylvester Solvers

Debasmit Das; Yash Bhalgat; Fatih Porikli

2021-05-24 v1; Neural and Evolutionary Computing; Computer Vision and Pattern Recognition; Machine Learning

Abstract

In this work, we propose a data-driven scheme to initialize the parameters of a deep neural network. This is in contrast to traditional approaches, which randomly initialize parameters by sampling from transformed standard distributions. Such methods do not use the training data to produce a more informed initialization. Our method uses a sequential layer-wise approach in which each layer is initialized using its input activations. The initialization is cast as an optimization problem in which we minimize a combination of encoding and decoding losses of the input activations, further constrained by a user-defined latent code. The optimization problem is then restructured into the well-known Sylvester equation, for which fast and efficient gradient-free solvers exist. Our data-driven method improves performance over random initialization methods, both before training starts and after training is complete. We show that our proposed method is especially effective in few-shot and fine-tuning settings. We conclude the paper with analyses of time complexity and of the effect of different latent codes on recognition performance.
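
To make the Sylvester step concrete, here is a minimal sketch of initializing a single layer. The abstract does not state the exact objective, so the formulation is an assumption: with input activations X (d x N) and a user-defined latent code S (k x N), the sketch minimizes ||WX - S||^2 + lam * ||X - W^T S||^2 over W. Setting the gradient to zero gives lam * S S^T W + W X X^T = (1 + lam) * S X^T, which has the Sylvester form A W + W B = C. The function name `sylvester_init`, the weight `lam`, and the small ridge `eps` are hypothetical and not taken from the paper.

```python
import numpy as np
from scipy.linalg import solve_sylvester


def sylvester_init(X, S, lam=1.0, eps=1e-6):
    """Sketch of a data-driven initialization for one layer.

    X: (d, N) input activations of the layer.
    S: (k, N) user-defined latent code (assumed given).
    Returns W: (k, d), the solution of A W + W B = C.
    """
    # Assumed objective: ||W X - S||^2 + lam * ||X - W^T S||^2
    A = lam * (S @ S.T)                      # (k, k), from the decoding term
    # eps adds a small ridge so that B is positive definite and the
    # Sylvester equation has a unique solution (an assumption of this sketch).
    B = X @ X.T + eps * np.eye(X.shape[0])   # (d, d), from the encoding term
    C = (1.0 + lam) * (S @ X.T)              # (k, d)
    return solve_sylvester(A, B, C)          # gradient-free, closed-form solve


# Toy data standing in for real activations and latent codes.
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 64))
S = rng.standard_normal((4, 64))
W = sylvester_init(X, S)
```

In a sequential layer-wise scheme, the activations produced with W would then serve as the input X for the next layer. The choice of latent code S is left open here, as it is user-defined.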

Keywords

neural network training, deep learning, randomized algorithm

Cite

@article{arxiv.2105.10335,
  title  = {Data-driven Weight Initialization with Sylvester Solvers},
  author = {Debasmit Das and Yash Bhalgat and Fatih Porikli},
  journal= {arXiv preprint arXiv:2105.10335},
  year   = {2021}
}

Comments

Practical Machine Learning for Developing Countries Workshop, International Conference on Learning Representations, 2021
