Related papers: PyPackIT: Automated Research Software Engineering …
Scientific advancement relies on the ability to share and reproduce results. When data analysis or calculations are carried out using software written by scientists there are special challenges around code versions, quality and code…
The principles of automation and innovation serve as foundational elements for advancement in contemporary science and technology. Here, we introduce Pygen, an automation platform designed to empower researchers, technologists, and…
Any cutting-edge scientific research project requires a myriad of computational tools for data generation, management, analysis and visualization. Python is a flexible and extensible scientific programming platform that offered the perfect…
As neuroimaging databases grow in size and complexity, the time researchers spend investigating and managing the data increases to the expense of data analysis. As a result, investigators rely more and more heavily on scripting using…
Despite the explosion of interest in healthcare AI research, the reproducibility and benchmarking of those research works are often limited due to the lack of standard benchmark datasets and diverse evaluation metrics. To address this…
Scikit-HEP is a community-driven and community-oriented project with the goal of providing an ecosystem for particle physics data analysis in Python. Scikit-HEP is a toolset of approximately twenty packages and a few "affiliated" packages.…
Difficulty replicating baselines, high computational costs, and required domain expertise create persistent barriers to clinical AI research. To address these challenges, we introduce PyHealth 2.0, an enhanced clinical deep learning toolkit…
A large amount of data is produced every second from modern information systems such as mobile devices, the world wide web, Internet of Things, social media, etc. Analysis and mining of this massive data requires a lot of advanced tools and…
The proliferation of open-source scientific software for science and research presents opportunities and challenges. In this paper, we introduce the SciCat dataset -- a comprehensive collection of Free-Libre Open Source Software (FLOSS)…
Critical goals of scientific computing are to increase scientific rigor, reproducibility, and transparency while keeping up with ever-increasing computational demands. This work presents an integrated framework well-suited for data…
In this work, we present scikit-fingerprints, a Python package for computation of molecular fingerprints for applications in chemoinformatics. Our library offers an industry-standard scikit-learn interface, allowing intuitive usage and easy…
We introduce SciWING, an open-source software toolkit which provides access to pre-trained models for scientific document processing tasks, inclusive of citation string parsing and logical structure recovery. SciWING enables researchers to…
Desktop GIS applications, such as ArcGIS and QGIS, provide tools essential for conducting suitability analysis, an activity that is central in formulating a land-use plan. But, when it comes to building complicated land-use suitability…
Cloud platforms allow users to execute tasks directly from their web browser and are a key enabling technology not only for commerce but also for computational science. Research software is often developed by scientists with limited…
The emergence of data-driven computational materials science offers unprecedented opportunities to explore complex material landscapes, complementing experimental research with the discovery of novel compounds. To enable these developments,…
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a…
We introduce SciEvalKit, a unified benchmarking toolkit designed to evaluate AI models for science across a broad range of scientific disciplines and task capabilities. Unlike general-purpose evaluation platforms, SciEvalKit focuses on the…
The recomputability and reproducibility of results from scientific software requires access to both the source code and all associated input and output data. However, the full collection of these resources often does not accompany the key…
Modern science clearly demands for a higher level of reproducibility and collaboration. To make research fully reproducible one has to take care of several aspects: research protocol description, data access, environment preservation,…
A large number of computational scientific research projects make use of open source software packages. However, the development process of such tools frequently differs from conventional software development; partly because of the nature…