Learning Kernel-Based Halfspaces with the Zero-One Loss

Shai Shalev-Shwartz; Ohad Shamir; Karthik Sridharan

Machine Learning 2010-08-03 v2

Abstract

We describe and analyze a new algorithm for agnostically learning kernel-based halfspaces with respect to the zero-one loss function. Unlike most previous formulations, which rely on surrogate convex loss functions (e.g., hinge loss in SVM and log loss in logistic regression), we provide finite time/sample guarantees with respect to the more natural zero-one loss function. The proposed algorithm can learn kernel-based halfspaces in worst-case time $\mathrm{poly}(\exp(L\log(L/\epsilon)))$, for any distribution, where $L$ is a Lipschitz constant (which can be thought of as the reciprocal of the margin), and the learned classifier is worse than the optimal halfspace by at most $\epsilon$. We also prove a hardness result, showing that under a certain cryptographic assumption, no algorithm can learn kernel-based halfspaces in time polynomial in $L$.
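
To make the contrast between the zero-one loss and a convex surrogate concrete, the following is a minimal sketch. It is not the algorithm proposed in the paper. The RBF kernel, the coefficients `alphas`, the points in `support`, and the toy `data` are all illustrative assumptions. It only evaluates both losses for a fixed kernel-based predictor.

```python
import math

def zero_one_loss(score, label):
    """Return 1.0 if the sign of score disagrees with label in {-1, +1}.
    A score of exactly 0 is counted as an error (a convention assumed here)."""
    return 0.0 if score * label > 0 else 1.0

def hinge_loss(score, label):
    """Convex surrogate used by SVMs; it upper-bounds the zero-one loss above."""
    return max(0.0, 1.0 - score * label)

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption for illustration."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def kernel_score(alphas, support, x, kernel=rbf_kernel):
    """Real-valued output of a kernel-based halfspace: sum_i alpha_i * k(s_i, x)."""
    return sum(a * kernel(s, x) for a, s in zip(alphas, support))

def average_loss(loss, alphas, support, data):
    """Empirical average of a loss over labeled examples (x, y)."""
    total = sum(loss(kernel_score(alphas, support, x), y) for x, y in data)
    return total / len(data)

# Hypothetical predictor and toy sample; the third example is deliberately mislabeled.
support = [(0.0, 0.0), (2.0, 2.0)]
alphas = [1.0, -1.0]
data = [((0.1, 0.0), 1), ((2.0, 1.9), -1), ((1.9, 2.0), 1)]

zo = average_loss(zero_one_loss, alphas, support, data)
hg = average_loss(hinge_loss, alphas, support, data)
print("zero-one:", zo, "hinge:", hg)
```

In this sketch the zero-one loss simply counts the one misclassified example out of three, while the hinge loss also penalizes how far that example lies on the wrong side, which is why the surrogate value is larger.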

Keywords

computational learning theory; kernel learning; approximation algorithm

Cite

@article{arxiv.1005.3681,
  title  = {Learning Kernel-Based Halfspaces with the Zero-One Loss},
  author = {Shai Shalev-Shwartz and Ohad Shamir and Karthik Sridharan},
  journal= {arXiv preprint arXiv:1005.3681},
  year   = {2010}
}

Comments

This is a full version of the paper appearing in the 23rd International Conference on Learning Theory (COLT 2010). Compared to the previous arXiv version, this version contains some small corrections in the proof of Lemma 3 and in Appendix A.
