Net2Net: Accelerating Learning via Knowledge Transfer

Machine Learning 2016-04-26 v4

Abstract

We introduce techniques for rapidly transferring the information stored in one neural net into another neural net. The main purpose is to accelerate the training of a significantly larger neural net. In real-world workflows, one often trains many different neural networks during the experimentation and design process. This is wasteful, because each new model is trained from scratch. Our Net2Net technique accelerates the experimentation process by instantaneously transferring the knowledge from a previous network to each new deeper or wider network. Our techniques are based on the concept of function-preserving transformations between neural network specifications. This differs from previous approaches to pre-training, which altered the function represented by a neural net when adding layers to it. Using our knowledge transfer mechanism to add depth to Inception modules, we demonstrate a new state-of-the-art accuracy on the ImageNet dataset.
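The abstract does not spell out how a function-preserving transformation is built, so the following is only a minimal sketch of the idea under stated assumptions: a small bias-free fully connected net with ReLU hidden layers, stored as a list of weight matrices. The helper names `widen` and `deepen` are hypothetical and are not taken from the paper's code. Widening copies existing hidden units and splits their outgoing weights among the copies; deepening inserts an identity-initialized layer, which leaves the output unchanged because applying ReLU twice equals applying it once.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(layers, x):
    # ReLU after every layer except the last, which is linear.
    for W in layers[:-1]:
        x = relu(matvec(W, x))
    return matvec(layers[-1], x)

def widen(layers, idx, new_width, rng):
    """Widen the output of hidden layer idx to new_width units (assumed sketch)."""
    W_in, W_out = layers[idx], layers[idx + 1]
    old_width = len(W_in)
    # The first old_width units map to themselves; extra units copy a random one.
    extra = [rng.randrange(old_width) for _ in range(new_width - old_width)]
    mapping = list(range(old_width)) + extra
    counts = [mapping.count(j) for j in range(old_width)]
    # Copies share the incoming weights of the unit they replicate.
    W_in_new = [list(W_in[j]) for j in mapping]
    # Outgoing weights are divided by the number of copies so the sum is unchanged.
    W_out_new = [[row[j] / counts[j] for j in mapping] for row in W_out]
    return layers[:idx] + [W_in_new, W_out_new] + layers[idx + 2:]

def deepen(layers, idx):
    """Insert an identity-initialized hidden layer after hidden layer idx."""
    width = len(layers[idx])
    identity = [[1.0 if i == j else 0.0 for j in range(width)] for i in range(width)]
    return layers[:idx + 1] + [identity] + layers[idx + 1:]

rng = random.Random(0)
# Teacher net: 3 inputs -> 4 hidden units -> 2 outputs.
teacher = [
    [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)],
    [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)],
]
x = [0.5, -1.0, 2.0]

wider = widen(teacher, 0, 7, rng)   # 3 -> 7 -> 2
deeper = deepen(wider, 0)           # 3 -> 7 -> 7 -> 2

reference = forward(teacher, x)
diff_wider = max(abs(a - b) for a, b in zip(reference, forward(wider, x)))
diff_deeper = max(abs(a - b) for a, b in zip(reference, forward(deeper, x)))
print(diff_wider, diff_deeper)  # both should be zero up to floating-point rounding
```

In this sketch the larger networks start out computing the same function as the smaller one, so any subsequent training would begin from the smaller network's accuracy rather than from a random initialization.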

Cite

@article{arxiv.1511.05641,
  title  = {Net2Net: Accelerating Learning via Knowledge Transfer},
  author = {Tianqi Chen and Ian Goodfellow and Jonathon Shlens},
  journal= {arXiv preprint arXiv:1511.05641},
  year   = {2016}
}

Comments

ICLR 2016 submission
