milearn: A Python Package for Multi-Instance Machine Learning
Abstract
We introduce milearn, a Python package for multi-instance learning (MIL) that follows the familiar scikit-learn fit/predict interface while providing a unified framework for both classical and neural-network-based MIL algorithms for regression and classification. The package also includes built-in hyperparameter optimization designed specifically for small MIL datasets, enabling robust model selection in data-scarce scenarios. We demonstrate the versatility of milearn across a broad range of synthetic MIL benchmark datasets, including digit classification and regression, molecular property prediction, and protein-protein interaction (PPI) prediction. Special emphasis is placed on the key instance detection (KID) problem, for which the package provides dedicated support.
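To make the bag-level fit/predict idea concrete, here is a minimal sketch in pure Python. It is not milearn's API: the class name `MeanPoolingBagClassifier`, its methods, and the toy data are all hypothetical and chosen only for illustration. Each bag is a list of instance feature vectors with a single label per bag; the sketch averages the instances of a bag into one vector and assigns the label of the nearest class centroid, which is one of the simplest possible MIL baselines.

```python
from statistics import fmean


class MeanPoolingBagClassifier:
    """Hypothetical bag-level classifier for illustration; not part of milearn."""

    def _pool(self, bag):
        # Average each feature over all instances in one bag.
        return [fmean(column) for column in zip(*bag)]

    def fit(self, bags, labels):
        # Group pooled bag vectors by their bag-level label.
        pooled = {}
        for bag, label in zip(bags, labels):
            pooled.setdefault(label, []).append(self._pool(bag))
        # Store one centroid per class, computed from the pooled vectors.
        self.centroids_ = {
            label: [fmean(column) for column in zip(*vectors)]
            for label, vectors in pooled.items()
        }
        return self

    def predict(self, bags):
        predictions = []
        for bag in bags:
            vector = self._pool(bag)
            # Pick the class whose centroid is closest in squared Euclidean distance.
            nearest = min(
                self.centroids_,
                key=lambda label: sum(
                    (a - b) ** 2 for a, b in zip(vector, self.centroids_[label])
                ),
            )
            predictions.append(nearest)
        return predictions


# Toy data (assumed): bags hold a variable number of 2-D instances.
train_bags = [
    [[0.0, 0.1], [0.2, 0.0]],
    [[0.1, 0.0], [0.0, 0.2], [0.1, 0.1]],
    [[1.0, 0.9], [0.8, 1.1]],
    [[0.9, 1.0], [1.1, 1.0], [1.0, 0.8]],
]
train_labels = [0, 0, 1, 1]

model = MeanPoolingBagClassifier().fit(train_bags, train_labels)
test_bags = [[[0.05, 0.05]], [[0.95, 1.05], [1.0, 1.0]]]
predictions = model.predict(test_bags)
print(predictions)
```

Mean pooling discards which instance drove the prediction, so a sketch like this cannot address key instance detection; that would require a model that scores or weights individual instances.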
Cite
@article{arxiv.2512.01287,
title = {milearn: A Python Package for Multi-Instance Machine Learning},
author = {Dmitry Zankov and Pavlo Polishchuk and Michal Sobieraj and Mario Barbatti},
journal = {arXiv preprint arXiv:2512.01287},
year = {2025}
}
Comments
Open-source software for multi-instance machine learning