Kernelized Classification in Deep Networks
Abstract
We propose a kernelized classification layer for deep networks. Although conventional deep networks introduce an abundance of nonlinearity for representation (feature) learning, they almost universally use a linear classifier on the learned feature vectors. We advocate a nonlinear classification layer by using the kernel trick on the softmax cross-entropy loss function during training and the scorer function during testing. However, the choice of the kernel remains a challenge. To tackle this, we theoretically show the possibility of optimizing over all possible positive definite kernels applicable to our problem setting. This theory is then used to devise a new kernelized classification layer that learns the optimal kernel function for a given problem automatically within the deep network itself. We show the usefulness of the proposed nonlinear classification layer on several datasets and tasks.
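As a rough illustration of the idea in the abstract, the sketch below replaces the inner product of a linear classifier with a kernel evaluation, then feeds the resulting scores to the usual softmax cross-entropy loss. It is a minimal sketch under stated assumptions, not the paper's layer: the Gaussian RBF kernel is a fixed stand-in chosen for illustration (the paper learns the kernel, which is not shown here), and the names `rbf_kernel_logits`, `softmax_cross_entropy`, `predict`, and `gamma` are hypothetical.

```python
import numpy as np


def rbf_kernel_logits(features, class_weights, gamma=1.0):
    """Score each feature vector against each class vector with an RBF kernel.

    A linear classifier would compute features @ class_weights.T; here that
    inner product is replaced by k(f, w) = exp(-gamma * ||f - w||^2).
    The RBF kernel is an assumption made for this sketch only.
    """
    # Broadcast (n, 1, d) against (1, c, d) to get pairwise differences (n, c, d).
    diff = features[:, None, :] - class_weights[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-gamma * sq_dist)


def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch, computed stably."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def predict(features, class_weights, gamma=1.0):
    """At test time, pick the class whose kernel score is highest."""
    return rbf_kernel_logits(features, class_weights, gamma).argmax(axis=1)


# Toy usage: two classes in a 2-D feature space (made-up numbers).
class_weights = np.array([[1.0, 0.0], [0.0, 1.0]])
features = np.array([[0.9, 0.1], [0.2, 0.8]])
labels = np.array([0, 1])

loss = softmax_cross_entropy(rbf_kernel_logits(features, class_weights), labels)
predictions = predict(features, class_weights)
```

In a real network the features would come from the preceding layers and `class_weights` would be trained by backpropagation; an automatic-differentiation framework would take the place of plain NumPy.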
Cite
@article{arxiv.2012.09607,
title = {Kernelized Classification in Deep Networks},
author = {Sadeep Jayasumana and Srikumar Ramalingam and Sanjiv Kumar},
journal = {arXiv preprint arXiv:2012.09607},
year = {2021}
}