Cross-validation-based optimal feature selection for linear SVM classification

Optimization and Control 2026-05-11 v1

Abstract

This paper addresses feature subset selection for Support Vector Machines (SVMs) based on the cross-validation criterion. Unlike statistical criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), cross-validation requires only the mild assumption that samples are independent and identically distributed (i.i.d.). For this reason, the cross-validation criterion is expected to work well across a wide range of prediction problems, and it has already proved useful as a feature subset selection method for regression. The objective of this paper is to extend best feature subset selection via the cross-validation criterion to SVM classification problems. This subset-selection problem can be formulated as a bilevel mixed-integer optimization problem. Because bilevel optimization problems are generally hard to solve, we introduce the Least Squares Support Vector Machine (LS-SVM), whose optimality conditions admit a closed-form expression, and use it to reduce the problem to a single-level mixed-integer optimization problem. This reformulation allows the problem to be solved with standard optimization software. We evaluate the proposed framework through simulation experiments that compare it with a regularization-based method (L1 regularization), a sequential search method (recursive feature elimination), and mixed-integer optimization (MIO) based on statistical criteria. The results show that the proposed framework performs favorably in both classification accuracy and feature selection accuracy.
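As a rough illustration of why the LS-SVM helps, the sketch below fits a linear LS-SVM by solving its stationarity conditions as a single linear system, then scores feature subsets by K-fold cross-validation. It is not the paper's formulation: brute-force enumeration stands in for the mixed-integer optimization model, the validation loss is assumed to be the misclassification rate, and the names `fit_ls_svm`, `cv_error`, `best_subset`, and the parameter `gamma` are hypothetical. It assumes the standard LS-SVM objective (1/2)||w||^2 + (gamma/2) * sum_i e_i^2 with y_i (w^T x_i + b) = 1 - e_i and labels in {-1, +1}.

```python
import itertools
import numpy as np


def fit_ls_svm(X, y, gamma=1.0):
    """Fit a linear LS-SVM by solving its optimality conditions directly.

    With labels in {-1, +1}, e_i^2 = (y_i - w^T x_i - b)^2, so setting the
    gradient to zero gives one (p + 1) x (p + 1) linear system in (w, b).
    """
    n, p = X.shape
    col_sums = X.sum(axis=0)
    A = np.zeros((p + 1, p + 1))
    A[:p, :p] = X.T @ X + np.eye(p) / gamma  # stationarity in w
    A[:p, p] = col_sums
    A[p, :p] = col_sums                      # stationarity in b
    A[p, p] = n
    rhs = np.append(X.T @ y, y.sum())
    sol = np.linalg.solve(A, rhs)
    return sol[:p], sol[p]


def cv_error(X, y, subset, k=5, gamma=1.0, seed=0):
    """K-fold misclassification rate using only the features in `subset`."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cols = list(subset)
    errors = 0
    for val in folds:
        train = np.setdiff1d(np.arange(len(y)), val)
        w, b = fit_ls_svm(X[np.ix_(train, cols)], y[train], gamma)
        scores = X[np.ix_(val, cols)] @ w + b
        pred = np.where(scores >= 0, 1.0, -1.0)
        errors += int(np.sum(pred != y[val]))
    return errors / len(y)


def best_subset(X, y, max_size, **kwargs):
    """Enumerate all subsets up to `max_size` and keep the lowest CV error.

    Assumption: enumeration replaces the MIO solver, so this is only
    practical for a small number of features.
    """
    p = X.shape[1]
    candidates = (
        s
        for r in range(1, max_size + 1)
        for s in itertools.combinations(range(p), r)
    )
    return min(candidates, key=lambda s: cv_error(X, y, s, **kwargs))
```

Because every fold's classifier comes from a linear solve rather than an inner optimization routine, the lower-level problem can be written as equality constraints, which is the property the single-level reformulation relies on. Calling `best_subset(X, y, max_size=2)` on a small data set with labels coded as -1.0 and 1.0 returns a tuple of column indices.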

Cite

@article{arxiv.2605.07089,
  title  = {Cross-validation-based optimal feature selection for linear SVM classification},
  author = {Masaharu Mori and Shunnosuke Ikeda and Ryuta Tamura and Yuichi Takano and Ryuhei Miyashiro},
  journal= {arXiv preprint arXiv:2605.07089},
  year   = {2026}
}

Comments

18 pages, 4 figures, 1 table