Knowledge Distillation for Multi-task Learning

Computer Vision and Pattern Recognition 2020-09-25 v2

Abstract

Multi-task learning (MTL) aims to learn a single model that performs multiple tasks, achieving good performance on all of them at a lower computational cost. Learning such a model requires jointly optimizing the losses of a set of tasks with different difficulty levels, magnitudes, and characteristics (e.g. cross-entropy, Euclidean loss), which leads to an imbalance problem in multi-task learning. To address this imbalance problem, we propose a knowledge distillation based method. We first learn a task-specific model for each task. We then train the multi-task model to minimize the task-specific losses and to produce the same features as the task-specific models. Because each task-specific network encodes different features, we introduce small task-specific adaptors that project the multi-task features onto the task-specific features. In this way, the adaptors align the task-specific and multi-task features, which enables balanced parameter sharing across tasks. Extensive experimental results demonstrate that our method optimizes a multi-task learning model in a more balanced way and achieves better overall performance.
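The objective described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the linear form of the adaptors, the L2 normalization of features, the squared Euclidean distance, the weight `lam`, and all function and task names below are assumptions made only for illustration. The sketch shows the shape of the idea: the multi-task model's shared feature is passed through one small adaptor per task, and each adapted feature is pulled toward the feature of the corresponding frozen task-specific model, in addition to the usual task losses.

```python
import math


def linear_adaptor(weights, feature):
    """Project a shared feature with a small linear map (assumed adaptor form)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def distillation_loss(adapted, target):
    """Squared Euclidean distance between normalized features (assumed distance)."""
    a, t = l2_normalize(adapted), l2_normalize(target)
    return sum((x - y) ** 2 for x, y in zip(a, t))


def multitask_objective(task_losses, shared_feature, adaptors,
                        teacher_features, lam=1.0):
    """Sum of task losses plus one weighted feature-matching term per task."""
    total = sum(task_losses.values())
    for task, weights in adaptors.items():
        adapted = linear_adaptor(weights, shared_feature)
        total += lam * distillation_loss(adapted, teacher_features[task])
    return total


# Toy example with two hypothetical tasks and 2-dimensional features.
shared = [1.0, 0.0]
adaptors = {
    "seg": [[1.0, 0.0], [0.0, 1.0]],    # identity projection
    "depth": [[0.0, 1.0], [1.0, 0.0]],  # swaps the two coordinates
}
teachers = {"seg": [2.0, 0.0], "depth": [0.0, 3.0]}
losses = {"seg": 0.5, "depth": 0.25}

objective = multitask_objective(losses, shared, adaptors, teachers)
```

In this toy setup each adaptor maps the shared feature onto the direction of its teacher's feature, so both feature-matching terms vanish and only the task losses remain. In training, the adaptor weights would be learned jointly with the multi-task network while the task-specific models stay fixed.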

Cite

@article{arxiv.2007.06889,
  title  = {Knowledge Distillation for Multi-task Learning},
  author = {Wei-Hong Li and Hakan Bilen},
  journal= {arXiv preprint arXiv:2007.06889},
  year   = {2020}
}

Comments

We propose a knowledge distillation method for addressing the imbalance problem in multi-task learning.