An Approximation Algorithm for Optimal Subarchitecture Extraction

Machine Learning 2020-10-19 v1 Data Structures and Algorithms

Abstract

We consider the problem of finding the set of architectural parameters for a chosen deep neural network which is optimal under three metrics: parameter size, inference speed, and error rate. In this paper we state the problem formally, and present an approximation algorithm that, for a large subset of instances, behaves like an FPTAS with an approximation error of $\rho \leq |1 - \epsilon|$, and that runs in $O(|\Xi| + |W^*_T|(1 + |\Theta||B||\Xi|/(\epsilon\, s^{3/2})))$ steps, where $\epsilon$ and $s$ are input parameters; $|B|$ is the batch size; $|W^*_T|$ denotes the cardinality of the largest weight set assignment; and $|\Xi|$ and $|\Theta|$ are the cardinalities of the candidate architecture and hyperparameter spaces, respectively.
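
To give a feel for how the stated running-time bound scales, the sketch below evaluates the expression inside the O(·) for concrete values. It is illustrative only: the function name `step_bound` and its argument names are hypothetical, the requirement that ϵ and s be positive is an assumption made so the division is defined, and the constants hidden by the big-O notation are ignored, so only ratios between results are meaningful.

```python
def step_bound(xi, w, theta, batch, eps, s):
    """Evaluate |Xi| + |W*_T| * (1 + |Theta| * |B| * |Xi| / (eps * s^(3/2))).

    xi, w, theta, batch stand for the cardinalities |Xi|, |W*_T|, |Theta|, |B|;
    eps and s are the two input parameters named in the abstract.
    Big-O constants are dropped, so this is a growth indicator, not a step count.
    """
    # Assumption: both input parameters are strictly positive.
    if eps <= 0 or s <= 0:
        raise ValueError("eps and s must be positive")
    return xi + w * (1 + theta * batch * xi / (eps * s ** 1.5))


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the trend.
    base = step_bound(xi=10, w=100, theta=5, batch=32, eps=0.5, s=4)
    tighter = step_bound(xi=10, w=100, theta=5, batch=32, eps=0.25, s=4)
    print(base, tighter, tighter / base)
```

Under these assumptions, halving ϵ roughly doubles the dominant term, while increasing s shrinks it by a factor of s^(3/2).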

Cite

@article{arxiv.2010.08512,
  title  = {An Approximation Algorithm for Optimal Subarchitecture Extraction},
  author = {Adrian de Wynter},
  journal= {arXiv preprint arXiv:2010.08512},
  year   = {2020}
}

Comments

Preprint. Under review. Original submission does not present the bibliography issues from this version.