Related papers: problexity -- an open-source Python library for bi…

How Complex is your classification problem? A survey on measuring classification complexity

Characteristics extracted from the training datasets of classification problems have proven to be effective predictors in a number of meta-analyses. Among them, measures of classification complexity can be used to estimate the difficulty in…

Machine Learning · Computer Science 2021-01-01 Ana C. Lorena, Luís P. F. Garcia, Jens Lehmann, Marcilio C. P. Souto, Tin K. Ho

A scikit-based Python environment for performing multi-label classification

scikit-multilearn is a Python library for performing multi-label classification. The library is compatible with the scikit/scipy ecosystem and uses sparse matrices for all internal operations. It provides native Python implementations of…

Machine Learning · Computer Science 2018-12-11 Piotr Szymański, Tomasz Kajdanowicz

ComplexityNet: Increasing LLM Inference Efficiency by Learning Task Complexity

We present ComplexityNet, a streamlined language model designed for assessing task complexity. This model predicts the likelihood of accurate output by various language models, each with different capabilities. Our initial application of…

Computation and Language · Computer Science 2024-10-16 Henry Bae, Aghyad Deeb, Alex Fleury, Kehang Zhu

libcll: an Extendable Python Toolkit for Complementary-Label Learning

Complementary-label learning (CLL) is a weakly supervised learning paradigm for multiclass classification, where only complementary labels -- indicating classes an instance does not belong to -- are provided to the learning algorithm.…

Machine Learning · Computer Science 2024-11-20 Nai-Xuan Ye, Tan-Ha Mai, Hsiu-Hsuan Wang, Wei-I Lin, Hsuan-Tien Lin

biquality-learn: a Python library for Biquality Learning

The democratization of Data Mining has been widely successful thanks in part to powerful and easy-to-use Machine Learning libraries. These libraries have been particularly tailored to tackle Supervised Learning. However, strong supervision…

Machine Learning · Computer Science 2023-08-21 Pierre Nodet, Vincent Lemaire, Alexis Bondu, Antoine Cornuéjols

learn2learn: A Library for Meta-Learning Research

Meta-learning researchers face two fundamental issues in their empirical work: prototyping and reproducibility. Researchers are prone to make mistakes when prototyping new algorithms and tasks because modern meta-learning methods rely on…

Machine Learning · Computer Science 2020-08-31 Sébastien M. R. Arnold, Praateek Mahajan, Debajyoti Datta, Ian Bunner, Konstantinos Saitas Zarkias

Evaluating Code Reasoning Abilities of Large Language Models Under Real-World Settings

Code reasoning tasks are becoming prevalent in large language model (LLM) assessments. Yet, there is a dearth of studies on the impact of real-world complexities on code reasoning, e.g., inter- or intra-procedural dependencies, API calls,…

Software Engineering · Computer Science 2026-04-27 Changshu Liu, Alireza Ghazanfari, Yang Chen, Reyhaneh Jabbarvand

metric-learn: Metric Learning Algorithms in Python

metric-learn is an open source Python package implementing supervised and weakly-supervised distance metric learning algorithms. As part of scikit-learn-contrib, it provides a unified interface compatible with scikit-learn which allows to…

Machine Learning · Computer Science 2020-07-28 William de Vazelhes, CJ Carey, Yuan Tang, Nathalie Vauquier, Aurélien Bellet

pyRecLab: A Software Library for Quick Prototyping of Recommender Systems

This paper introduces pyRecLab, a software library written in C++ with Python bindings that allows users to quickly train, test, and develop recommender systems. Although there are several software libraries for this purpose, only a few let…

Software Engineering · Computer Science 2017-07-12 Gabriel Sepulveda, Vicente Dominguez, Denis Parra

abess: A Fast Best Subset Selection Library in Python and R

We introduce a new library named abess that implements a unified framework of best-subset selection for solving diverse machine learning problems, e.g., linear regression, classification, and principal component analysis. Particularly, the…

Machine Learning · Statistics 2024-04-02 Jin Zhu, Xueqin Wang, Liyuan Hu, Junhao Huang, Kangkang Jiang, Yanhang Zhang, Shiyun Lin, Junxian Zhu

Scikit-learn: Machine Learning in Python

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a…

Machine Learning · Computer Science 2018-06-06 Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay

dalex: Responsible Machine Learning with Interactive Explainability and Fairness in Python

The increasing amount of available data, computing power, and the constant pursuit for higher performance results in the growing complexity of predictive models. Their black-box nature leads to opaqueness debt phenomenon inflicting…

Machine Learning · Computer Science 2021-10-12 Hubert Baniecki, Wojciech Kretowicz, Piotr Piatyszek, Jakub Wisniewski, Przemyslaw Biecek

A metric for software vulnerabilities classification

Vulnerability discovery and exploits detection are two wide areas of study in software engineering. This preliminary work tries to combine existing methods with machine learning techniques to define a metric classification of vulnerable…

Software Engineering · Computer Science 2014-07-23 Gabriele Modena

scikit-hubness: Hubness Reduction and Approximate Neighbor Search

This paper introduces scikit-hubness, a Python package for efficient nearest neighbor search in high-dimensional spaces. Hubness is an aspect of the curse of dimensionality, and is known to impair various learning tasks, including…

Machine Learning · Computer Science 2021-01-12 Roman Feldbauer, Thomas Rattei, Arthur Flexer

Serenity: Library Based Python Code Analysis for Code Completion and Automated Machine Learning

Dynamically typed languages such as Python have become very popular. Among other strengths, Python's dynamic nature and its straightforward linking to native code have made it the de-facto language for many research areas such as Artificial…

Programming Languages · Computer Science 2023-01-13 Wenting Zhao, Ibrahim Abdelaziz, Julian Dolby, Kavitha Srinivas, Mossad Helali, Essam Mansour

Complexity Metric for Code-Mixed Social Media Text

An evaluation metric is an absolute necessity for measuring the performance of any system and complexity of any data. In this paper, we have discussed how to determine the level of complexity of code-mixed social media texts that are…

Computation and Language · Computer Science 2017-07-06 Souvick Ghosh, Satanu Ghosh, Dipankar Das

CLASSify: A Web-Based Tool for Machine Learning

Machine learning classification problems are widespread in bioinformatics, but the technical knowledge required to perform model training, optimization, and inference can prevent researchers from utilizing this technology. This article…

Machine Learning · Computer Science 2023-10-06 Aaron D. Mullen, Samuel E. Armstrong, Jeff Talbert, V. K. Cody Bumgardner

Characterizing instance hardness in classification and regression problems

Some recent pieces of work in the Machine Learning (ML) literature have demonstrated the usefulness of assessing which observations are hardest to have their label predicted accurately. By identifying such instances, one may inspect whether…

Machine Learning · Computer Science 2022-12-06 Gustavo P. Torquette, Victor S. Nunes, Pedro Y. A. Paiva, Lourenço B. C. Neto, Ana C. Lorena

srlearn: A Python Library for Gradient-Boosted Statistical Relational Models

We present srlearn, a Python library for boosted statistical relational models. We adapt the scikit-learn interface to this setting and provide examples for how this can be used to express learning and inference problems.

Machine Learning · Computer Science 2019-12-19 Alexander L. Hayes

The Multiplex Classification Framework: optimizing multi-label classifiers through problem transformation, ontology engineering, and model ensembling

Classification is a fundamental task in machine learning. While conventional methods, such as binary, multiclass, and multi-label classification, are effective for simpler problems, they may not adequately address the complexities of some…

Machine Learning · Computer Science 2024-12-20 Mauro Nievas Offidani, Facundo Roffet, Claudio Augusto Delrieux, Maria Carolina Gonzalez Galtier, Marcos Zarate