Related papers: NApy: Efficient Statistics in Python for Large-Sca…
We introduce FDApy, an open-source Python package for the analysis of functional data. The package provides tools for the representation of (multivariate) functional data defined on different dimensional domains and for functional data that…
Robust estimation provides essential tools for analyzing data that contain outliers, ensuring that statistical models remain reliable even in the presence of some anomalous data. While robust methods have long been available in R, users of…
Space-filling experimental design techniques are commonly used in many computer modeling and simulation studies to explore the effects of inputs on outputs. This research presents raxpy, a Python package that leverages expressive annotation…
Behavioral studies using personal digital devices typically produce rich longitudinal datasets of mixed data types. These data provide information about the behavior of users of these devices in real-time and in the users' natural…
Raman spectroscopy is a non-destructive and label-free chemical analysis technique, which plays a key role in the analysis and discovery cycle of various branches of science. Nonetheless, progress in Raman spectroscopic analysis is still…
A large amount of data is produced every second from modern information systems such as mobile devices, the World Wide Web, the Internet of Things, social media, etc. Analysis and mining of this massive data requires a lot of advanced tools and…
Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and…
DADApy is a Python software package for analysing and characterising high-dimensional data manifolds. It provides methods for estimating the intrinsic dimension and the probability density, for performing density-based clustering and for…
Python is rapidly becoming the lingua franca of machine learning and scientific computing. With the broad use of frameworks such as NumPy, SciPy, and TensorFlow, scientific computing and machine learning are seeing a productivity boost on…
Compound AI applications, which compose calls to ML models using a general-purpose programming language like Python, are widely used for a variety of user-facing tasks, from software engineering to enterprise automation, making their…
In this paper, we present resolvent4py, a parallel Python package for the analysis, model reduction and control of large-scale linear systems with millions or billions of degrees of freedom. This package provides the user with a friendly…
PaPy, which stands for parallel pipelines in Python, is a highly flexible framework that enables the construction of robust, scalable workflows for either generating or processing voluminous datasets. A workflow is created from user-written…
In this paper, we present MusPy, an open source Python library for symbolic music generation. MusPy provides easy-to-use tools for essential components in a music generation system, including dataset management, data I/O, data preprocessing…
Incomplete data is a persistent challenge in real-world datasets, often governed by complex and unobservable missing mechanisms. Simulating missingness has become a standard approach for understanding its impact on learning and analysis.…
Bayesian Networks (BNs) are used in various fields for modeling, prediction, and decision making. pgmpy is a Python package that provides a collection of algorithms and tools to work with BNs and related models. It implements algorithms for…
Flaky tests obstruct software development, and studying and proposing mitigations against them has therefore become an important focus of software engineering research. To conduct sound investigations on test flakiness, it is crucial to…
We present an overview of Sherpa, an open source Python project, and discuss its development history, broad design concepts and capabilities. Sherpa contains powerful tools for combining parametric models into complex expressions that can…
Developing efficient parallel applications is critical to advancing scientific development but requires significant performance analysis and optimization. Performance analysis tools help developers manage the increasing complexity and scale…
Magpy is a C++ accelerated Python package for modelling and simulating the magnetic dynamics of nano-sized particles. Nanoparticles are modelled as a system of three-dimensional macrospins and simulated with a set of coupled stochastic…
In this paper we present SurvLIMEpy, an open-source Python package that implements the SurvLIME algorithm. This method allows computing local feature importance for machine learning algorithms designed for modelling Survival Analysis data.…