milearn: A Python Package for Multi-Instance Machine Learning

Machine Learning 2025-12-02 v1

Abstract

We introduce milearn, a Python package for multi-instance learning (MIL) that follows the familiar scikit-learn fit/predict interface while providing a unified framework for both classical and neural-network-based MIL algorithms for regression and classification. The package also includes built-in hyperparameter optimization designed specifically for small MIL datasets, enabling robust model selection in data-scarce scenarios. We demonstrate the versatility of milearn across a broad range of synthetic MIL benchmark datasets, including digit classification and regression, molecular property prediction, and protein-protein interaction (PPI) prediction. Special emphasis is placed on the key instance detection (KID) problem, for which the package provides dedicated support.
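The fit/predict convention on bags of instances can be illustrated with a small, dependency-free sketch. It is not milearn's API: the class `MeanPoolingBagClassifier`, its attributes, and the toy data are hypothetical and chosen only for illustration. The sketch assumes that a bag is a variable-length list of equal-length feature vectors with one label per bag, and it uses a deliberately simple baseline (mean pooling followed by a nearest-centroid rule) rather than any algorithm from the package.

```python
from math import dist


class MeanPoolingBagClassifier:
    """Hypothetical MIL baseline for illustration; not part of milearn."""

    def fit(self, bags, labels):
        # Collapse each bag (a list of instance vectors) into one pooled vector.
        pooled = [self._pool(bag) for bag in bags]
        self.classes_ = sorted(set(labels))
        # Store one centroid per class, averaged over the pooled bag vectors.
        self.centroids_ = {
            c: self._pool([p for p, y in zip(pooled, labels) if y == c])
            for c in self.classes_
        }
        return self

    def predict(self, bags):
        # Assign each bag to the class whose centroid is closest to its pooled vector.
        predictions = []
        for bag in bags:
            pooled = self._pool(bag)
            predictions.append(
                min(self.classes_, key=lambda c: dist(pooled, self.centroids_[c]))
            )
        return predictions

    @staticmethod
    def _pool(vectors):
        # Column-wise mean of a list of equal-length vectors.
        n = len(vectors)
        return tuple(sum(column) / n for column in zip(*vectors))


# Toy data: bags hold different numbers of instances, labels are per bag.
train_bags = [
    [[0.0, 0.1], [0.2, 0.0]],
    [[0.1, 0.2], [0.0, 0.0], [0.1, 0.1]],
    [[1.0, 0.9], [0.8, 1.1]],
    [[0.9, 1.0], [1.1, 1.0], [1.0, 0.8]],
]
train_labels = [0, 0, 1, 1]

model = MeanPoolingBagClassifier().fit(train_bags, train_labels)
predictions = model.predict([[[0.05, 0.05]], [[0.95, 1.05], [1.0, 1.0]]])
print(predictions)
```

Mean pooling discards which instance drove the prediction, so a baseline like this cannot address key instance detection. That task needs models that score individual instances.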

Cite

@article{arxiv.2512.01287,
  title   = {milearn: A Python Package for Multi-Instance Machine Learning},
  author  = {Dmitry Zankov and Pavlo Polishchuk and Michal Sobieraj and Mario Barbatti},
  journal = {arXiv preprint arXiv:2512.01287},
  year    = {2025}
}

Comments

Open-source software for multi-instance machine learning