Related papers: HiClass: a Python library for local hierarchical c…
scikit-multilearn is a Python library for performing multi-label classification. The library is compatible with the scikit/scipy ecosystem and uses sparse matrices for all internal operations. It provides native Python implementations of…
This paper presents the HiPart package, an open-source native python library that provides efficient and interpret-able implementations of divisive hierarchical clustering algorithms. HiPart supports interactive visualizations for the…
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a…
GitHub has become an important platform for code sharing and scientific exchange. With the massive number of repositories available, there is a pressing need for topic-based search. Even though the topic label functionality has been…
The democratization of Data Mining has been widely successful thanks in part to powerful and easy-to-use Machine Learning libraries. These libraries have been particularly tailored to tackle Supervised Learning. However, strong supervision…
Hierarchical text classification (HTC) is a special sub-task of multi-label classification (MLC) whose taxonomy is constructed as a tree and each sample is assigned with at least one path in the tree. Latest HTC models contain three…
Scikit-learn is an increasingly popular machine learning li- brary. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design…
Hyperbox-based machine learning algorithms are an important and popular branch of machine learning in the construction of classifiers using fuzzy sets and logic theory and neural network architectures. This type of learning is characterised…
Whole-slide image analysis is essential for diagnostic tasks in pathology, yet existing deep learning methods primarily rely on flat classification, ignoring hierarchical relationships among class labels. In this study, we propose HiClass,…
Scikit-network is a Python package inspired by scikit-learn for the analysis of large graphs. Graphs are represented by their adjacency matrix in the sparse CSR format of SciPy. The package provides state-of-the-art algorithms for ranking,…
We present srlearn, a Python library for boosted statistical relational models. We adapt the scikit-learn interface to this setting and provide examples for how this can be used to express learning and inference problems.
This paper introduces scikit-hubness, a Python package for efficient nearest neighbor search in high-dimensional spaces. Hubness is an aspect of the curse of dimensionality, and is known to impair various learning tasks, including…
Large collections of time series data are commonly organized into structures with different levels of aggregation; examples include product and geographical groupings. It is often important to ensure that the forecasts are coherent so that…
Imbalanced-learn is an open-source python toolbox aiming at providing a wide range of methods to cope with the problem of imbalanced dataset frequently encountered in machine learning and pattern recognition. The implemented…
The classification problem's complexity assessment is an essential element of many topics in the supervised learning domain. It plays a significant role in meta-learning -- becoming the basis for determining meta-attributes or…
We present Qiskit Machine Learning (ML), a high-level Python library that combines elements of quantum computing with traditional machine learning. The API abstracts Qiskit's primitives to facilitate interactions with classical simulators…
stream-learn is a Python package compatible with scikit-learn and developed for the drifting and imbalanced data stream analysis. Its main component is a stream generator, which allows to produce a synthetic data stream that may incorporate…
We introduce small-text, an easy-to-use active learning library, which offers pool-based active learning for single- and multi-label text classification in Python. It features numerous pre-implemented state-of-the-art query strategies,…
Context: Classification of software requirements into different categories is a critically important task in requirements engineering (RE). Developing machine learning (ML) approaches for requirements classification has attracted great…
Libraries for supervised classification have enabled the wide-spread usage of machine learning methods. Existing libraries, such as scikit-learn, caret, and mlpack, implement techniques based on the classical empirical risk minimization…