Scalable Semi-supervised Learning with Graph-based Kernel Machine

Machine Learning 2017-04-07 v3

Abstract

Acquiring labels is often costly, whereas unlabeled data are usually easy to obtain in modern machine learning applications. Semi-supervised learning provides a principled machine learning framework to address such situations, and has been applied successfully in many real-world applications and industries. Nonetheless, most existing semi-supervised learning methods encounter two serious limitations when applied to modern, large-scale datasets: computational burden and memory usage. To this end, we present in this paper the Graph-based semi-supervised Kernel Machine (GKM), a method that combines the generalization ability of kernel-based methods with the geometrical and distributional information carried by a spectral graph induced from the data. Our proposed GKM can be solved directly in the primal form using Stochastic Gradient Descent with the ideal convergence rate $O(\frac{1}{T})$. In addition, our formulation is suitable for a wide spectrum of important loss functions in the machine learning literature (e.g., Hinge, smooth Hinge, Logistic, L1, and $\epsilon$-insensitive) and smoothness functions (i.e., $l_p(t) = |t|^p$ with $p \ge 1$). We further show that the well-known Laplacian Support Vector Machine is a special case of our formulation. We validate our proposed method on several benchmark datasets to demonstrate that GKM is appropriate for large-scale datasets, since it is optimal in memory usage and yields superior classification accuracy while simultaneously achieving a significant computational speed-up in comparison with state-of-the-art baselines.
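For readers who want the named ingredients in concrete form, the following is a minimal Python sketch of several of the loss functions listed above and of the smoothness function $l_p(t) = |t|^p$. It is an illustration only and is not taken from the paper: the function names, the default `eps=0.1`, and the helper `smoothness_penalty` (which sums a weighted $l_p$ over graph edges) are assumptions made here, and the smooth Hinge loss is omitted because the abstract does not say which variant is meant.

```python
import math


def hinge(y, o):
    """Hinge loss for a label y in {-1, +1} and a real-valued score o."""
    return max(0.0, 1.0 - y * o)


def logistic(y, o):
    """Logistic loss log(1 + exp(-y * o))."""
    return math.log1p(math.exp(-y * o))


def l1(y, o):
    """L1 (absolute error) loss for a real-valued target y."""
    return abs(y - o)


def eps_insensitive(y, o, eps=0.1):
    """Epsilon-insensitive loss; eps=0.1 is an arbitrary illustrative default."""
    return max(0.0, abs(y - o) - eps)


def l_p(t, p=2.0):
    """Smoothness function l_p(t) = |t|^p, defined for p >= 1."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return abs(t) ** p


def smoothness_penalty(scores, edges, p=2.0):
    """Hypothetical helper: sum of weight * l_p(score_u - score_v) over edges.

    `scores` is a list of model outputs, one per data point; `edges` is a
    list of (u, v, weight) triples describing a graph over the data points.
    """
    return sum(w * l_p(scores[u] - scores[v], p) for u, v, w in edges)


if __name__ == "__main__":
    scores = [0.0, 1.0, 3.0]
    edges = [(0, 1, 1.0), (1, 2, 0.5)]
    print(hinge(1, 0.5))
    print(smoothness_penalty(scores, edges, p=2.0))
```

With `p=2.0` and each undirected edge listed once, the penalty in this sketch equals the quadratic form $f^\top L f$ of the graph Laplacian, which is the kind of regularizer Laplacian-based methods use; how GKM weights and samples the edges is not specified in the abstract and is left open here.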

Cite

@article{arxiv.1606.06793,
  title  = {Scalable Semi-supervised Learning with Graph-based Kernel Machine},
  author = {Trung Le and Khanh Nguyen and Van Nguyen and Vu Nguyen and Dinh Phung},
  journal= {arXiv preprint arXiv:1606.06793},
  year   = {2017}
}

Comments

21 pages
