
A MOM-based ensemble method for robustness, subsampling and hyperparameter tuning

Statistics Theory, 2019-05-22, v2

Abstract

Hyperparameter tuning and model selection are important steps in machine learning. Unfortunately, classical hyperparameter calibration and model selection procedures are sensitive to outliers and heavy-tailed data. In this work, we construct a selection procedure that can be seen as a robust alternative to cross-validation and is based on a median-of-means (MOM) principle. Using this procedure, we also build an ensemble method which, given a collection of algorithms and corrupted heavy-tailed data, selects an algorithm, trains it on a large uncorrupted subsample, and automatically tunes its hyperparameters. The construction relies on a divide-and-conquer methodology, making the method easily scalable for autoML on a corrupted database. The method is tested with the LASSO, which is known to be highly sensitive to outliers.
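
To make the median-of-means principle concrete, here is a minimal Python sketch of the basic MOM estimator of a mean: split the sample into blocks, average each block, and take the median of the block averages. This illustrates the general principle only, not the selection procedure or ensemble method of the paper; the function name `median_of_means` and the parameters `n_blocks` and `rng` are assumptions introduced for this example.

```python
import numpy as np


def median_of_means(values, n_blocks, rng=None):
    """Basic median-of-means estimate of the mean of `values`.

    Illustrative sketch only: the name and parameters are hypothetical,
    and this is not the paper's selection procedure.
    """
    values = np.asarray(values, dtype=float).ravel()
    if not 1 <= n_blocks <= values.size:
        raise ValueError("n_blocks must be between 1 and the sample size")

    # Shuffle so that block membership does not depend on data order.
    rng = np.random.default_rng(rng)
    shuffled = rng.permutation(values)

    # Split into n_blocks blocks of (nearly) equal size and average each one.
    blocks = np.array_split(shuffled, n_blocks)
    block_means = np.array([block.mean() for block in blocks])

    # A few corrupted blocks cannot move the median of the block means.
    return float(np.median(block_means))
```

As long as fewer than half of the blocks contain an outlier, the median is taken over a majority of clean block means, which is why the estimate stays stable where the empirical mean does not. With `n_blocks=1` the function reduces to the ordinary empirical mean.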

Cite

@article{arxiv.1812.02435,
  title  = {A MOM-based ensemble method for robustness, subsampling and hyperparameter tuning},
  author = {Joon Kwon and Guillaume Lecué and Matthieu Lerasle},
  journal = {arXiv preprint arXiv:1812.02435},
  year   = {2019}
}

Comments

17 pages, 3 figures