Principled Deep Neural Network Training through Linear Programming

Machine Learning 2022-03-03 v3 Optimization and Control Machine Learning

Abstract

Deep learning has received much attention lately due to the impressive empirical performance achieved by training algorithms. Consequently, a need for a better theoretical understanding of these problems has become more evident in recent years. In this work, using a unified framework, we show that there exists a polyhedron which encodes simultaneously all possible deep neural network training problems that can arise from a given architecture, activation functions, loss function, and sample-size. Notably, the size of the polyhedral representation depends only linearly on the sample-size, and a better dependency on several other network parameters is unlikely (assuming $P \neq NP$). Additionally, we use our polyhedral representation to obtain new and better computational complexity results for training problems of well-known neural network architectures. Our results provide a new perspective on training problems through the lens of polyhedral theory and reveal a strong structure arising from these problems.

Cite

@article{arxiv.1810.03218,
  title  = {Principled Deep Neural Network Training through Linear Programming},
  author = {Daniel Bienstock and Gonzalo Muñoz and Sebastian Pokutta},
  journal= {arXiv preprint arXiv:1810.03218},
  year   = {2022}
}