English
Related papers

Related papers: Scikit-Multiflow: A Multi-output Streaming Framewo…

200 papers

River is a machine learning library for dynamic data streams and continual learning. It provides multiple state-of-the-art learning methods, data generators/transformers, performance metrics and evaluators for different stream learning…

Mining data streams is a challenge per se. It must be ready to deal with an enormous amount of data and with problems not present in batch machine learning, such as concept drift. Therefore, applying a batch-designed technique, such as…

Machine Learning · Computer Science 2020-08-21 Lucca Portes Cavalheiro , Jean Paul Barddal , Alceu de Souza Britto , Laurent Heutte

scikit-multilearn is a Python library for performing multi-label classification. The library is compatible with the scikit/scipy ecosystem and uses sparse matrices for all internal operations. It provides native Python implementations of…

Machine Learning · Computer Science 2018-12-11 Piotr Szymański , Tomasz Kajdanowicz

Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a…

stream-learn is a Python package compatible with scikit-learn and developed for the drifting and imbalanced data stream analysis. Its main component is a stream generator, which allows to produce a synthetic data stream that may incorporate…

Machine Learning · Computer Science 2020-01-31 Paweł Ksieniewicz , Paweł Zyblewski

This paper describes HyperStream, a large-scale, flexible and robust software package, written in the Python language, for processing streaming data with workflow creation capabilities. HyperStream overcomes the limitations of other…

Machine Learning · Computer Science 2019-08-09 Tom Diethe , Meelis Kull , Niall Twomey , Kacper Sokol , Hao Song , Miquel Perello-Nieto , Emma Tonkin , Peter Flach

An increasing number of scientific applications rely on stream processing for generating timely insights from data feeds of scientific instruments, simulations, and Internet-of-Thing (IoT) sensors. The development of streaming applications…

Distributed, Parallel, and Cluster Computing · Computer Science 2018-11-13 Andre Luckow , George Chantzialexiou , Shantenu Jha

Due to the unspecified and dynamic nature of data streams, online machine learning requires powerful and flexible solutions. However, evaluating online machine learning methods under realistic conditions is difficult. Existing work…

Machine Learning · Computer Science 2022-04-29 Johannes Haug , Effi Tramountani , Gjergji Kasneci

FastFlow is a structured parallel programming framework targeting shared memory multicores. Its layered design and the optimized implementation of the communication mechanisms used to implement the FastFlow streaming networks provided to…

Distributed, Parallel, and Cluster Computing · Computer Science 2012-04-25 Marco Aldinucci , Marco Danelutto , Massimo Torquati

Shared memory multiprocessors come back to popularity thanks to rapid spreading of commodity multi-core architectures. As ever, shared memory programs are fairly easy to write and quite hard to optimise; providing multi-core programmers…

Distributed, Parallel, and Cluster Computing · Computer Science 2009-09-10 Marco Aldinucci , Massimo Torquati , Massimiliano Meneghin

In the AI-for-science era, scientific computing scenarios such as concurrent learning and high-throughput computing demand a new generation of infrastructure that supports scalable computing resources and automated workflow management on…

Managing data and code in open scientific research is complicated by two key problems: large datasets often cannot be stored alongside code in repository platforms like GitHub, and iterative analysis can lead to unnoticed changes to data,…

Digital Libraries · Computer Science 2023-11-10 Vince Buffalo

Scikit-network is a Python package inspired by scikit-learn for the analysis of large graphs. Graphs are represented by their adjacency matrix in the sparse CSR format of SciPy. The package provides state-of-the-art algorithms for ranking,…

Social and Information Networks · Computer Science 2020-09-17 Thomas Bonald , Nathan de Lara , Quentin Lutz , Bertrand Charpentier

Reusable data/code and reproducible analyses are foundational to quality research. This aspect, however, is often overlooked when designing interactive stream analysis workflows for time-series data (e.g., eye-tracking data). A mechanism to…

Databases · Computer Science 2022-06-20 Yasith Jayawardana , Vikas G. Ashok , Sampath Jayarathna

Streaming Speech-to-Text Translation (StreamST) requires producing translations concurrently with incoming speech, imposing strict latency constraints and demanding models that balance partial-information decision-making with high…

Computation and Language · Computer Science 2025-12-22 Marco Gaido , Sara Papi , Mauro Cettolo , Matteo Negri , Luisa Bentivogli

The proliferation of open-source scientific software for science and research presents opportunities and challenges. In this paper, we introduce the SciCat dataset -- a comprehensive collection of Free-Libre Open Source Software (FLOSS)…

Software Engineering · Computer Science 2023-12-12 Addi Malviya-Thakur , Reed Milewicz , Lavinia Paganini , Ahmed Samir Imam Mahmoud , Audris Mockus

FluidDyn is a project to foster open-science and open-source in the fluid dynamics community. It is thought of as a research project to channel open-source dynamics, methods and tools to do science. We propose a set of Python packages…

Other Computer Science · Computer Science 2019-04-10 Pierre Augier , Ashwin Vishnu Mohanan , Cyrille Bonamy

We introduce SciWING, an open-source software toolkit which provides access to pre-trained models for scientific document processing tasks, inclusive of citation string parsing and logical structure recovery. SciWING enables researchers to…

Digital Libraries · Computer Science 2020-10-26 Abhinav Ramesh Kashyap , Min-Yen Kan

Scikit-learn is an increasingly popular machine learning li- brary. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design…

We present DataFlow, a computational framework for building, testing, and deploying high-performance machine learning systems on unbounded time-series data. Traditional data science workflows assume finite datasets and require substantial…

Machine Learning · Computer Science 2026-01-01 Giacinto Paolo Saggese , Paul Smith
‹ Prev 1 2 3 10 Next ›