Fast Parallel SVM using Data Augmentation

Machine Learning 2015-12-25 v1

Abstract

Linear SVMs are among the most popular classifiers, yet they remain challenging to train on very large-scale problems, even though linear and sub-linear algorithms have recently been developed for single machines. Parallel computing methods have been developed for learning large-scale SVMs, but existing methods rely on solving local sub-optimization problems. In this paper, we develop a novel parallel algorithm for learning large-scale linear SVMs. Our approach is based on an equivalent data augmentation formulation, which casts SVM learning as a Bayesian inference problem for which we can develop very efficient parallel sampling methods. We provide empirical results for this parallel sampling SVM, extensions to SVR and non-linear kernels, and a parallel implementation of the Crammer and Singer model. The approach is promising in its own right and is also a useful technique for parallelizing a broader family of general maximum-margin models.
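The abstract does not spell out the augmented formulation, so the following is only a minimal sketch under stated assumptions. It assumes a scale-mixture augmentation of the hinge loss in the style of Polson and Scott, with pseudo-likelihood exp(-2 max(0, 1 - y_i w·x_i)) and a Gaussian prior on w. For brevity it uses the deterministic EM (MAP) variant rather than the sampling method the abstract describes. The names `shard_statistics`, `fit_linear_svm_em`, `prior_var`, and `num_shards` are hypothetical, and the "workers" are simulated by a plain loop over data shards. The point it illustrates is that, given the augmentation variables, the update for w depends on the data only through sums that each shard can compute independently.

```python
import numpy as np


def shard_statistics(X, y, w, eps=1e-6):
    """Sufficient statistics contributed by one data shard (hypothetical helper).

    Assumes the augmented conditional for w is Gaussian with
    precision  sum_i x_i x_i^T / gamma_i  and
    linear term  sum_i y_i x_i (1 + 1 / gamma_i),
    where the EM step sets gamma_i = |1 - y_i w.x_i| (clipped at eps).
    """
    margin_gap = 1.0 - y * (X @ w)
    inv_gamma = 1.0 / np.maximum(np.abs(margin_gap), eps)
    precision = (X * inv_gamma[:, None]).T @ X
    linear = X.T @ (y * (1.0 + inv_gamma))
    return precision, linear


def fit_linear_svm_em(X, y, num_shards=4, prior_var=1.0, num_iters=50):
    """EM for a linear SVM (no bias term) with shard-wise statistics."""
    n, d = X.shape
    w = np.zeros(d)
    shards = np.array_split(np.arange(n), num_shards)
    for _ in range(num_iters):
        # Start from the Gaussian prior's contribution to the precision.
        precision = np.eye(d) / prior_var
        linear = np.zeros(d)
        # Each pass of this loop stands in for one worker; only the
        # d x d matrix and length-d vector would need to be communicated.
        for idx in shards:
            P, b = shard_statistics(X[idx], y[idx], w)
            precision += P
            linear += b
        w = np.linalg.solve(precision, linear)
    return w


# Toy data: two well-separated Gaussian clusters, labels in {-1, +1}.
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=2.0, scale=0.5, size=(100, 2))
X_neg = rng.normal(loc=-2.0, scale=0.5, size=(100, 2))
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(100), -np.ones(100)])

w = fit_linear_svm_em(X, y)
accuracy = np.mean(np.sign(X @ w) == y)
```

Because the per-shard statistics simply add, the result does not depend on how the data are partitioned, up to floating-point rounding. A sampling version would, under the same assumptions, draw the augmentation variables and w from their conditionals instead of using point updates.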

Cite

@article{arxiv.1512.07716,
  title  = {Fast Parallel SVM using Data Augmentation},
  author = {Hugh Perkins and Minjie Xu and Jun Zhu and Bo Zhang},
  journal= {arXiv preprint arXiv:1512.07716},
  year   = {2015}
}