Related papers: Speeding simulation analysis up with yt and Intel …
The usage of the high-level scripting language Python has enabled new mechanisms for data interrogation, discovery and visualization of scientific data. We present yt, an open source, community-developed astrophysical analysis and…
The analysis of complex multiphysics astrophysical simulations presents a unique and rapidly growing set of challenges: reproducibility, parallelization, and vast increases in data size and complexity chief among them. In order to meet…
In the exascale computing era, handling and analyzing massive datasets have become extremely challenging. In situ analysis, which processes data during simulation runtime and bypasses costly intermediate disk input and output steps, offers…
There are numerous approaches to building analysis applications across the high-energy physics community. Among them are Python-based, or at least Python-driven, analysis workflows. We aim to ease the adoption of a Python-based analysis…
The freud Python package is a powerful library for analyzing simulation data. Written with modern simulation and data analysis workflows in mind, freud provides a Python interface to fast, parallelized C++ routines that run efficiently on…
This paper presents a lightweight, open-source and high-performance python package for solving peridynamics problems in solid mechanics. The development of this solver is motivated by the need for fast analysis tools to achieve the large…
SkyPy is an open-source Python package for simulating the astrophysical sky. It comprises a library of physical and empirical models across a range of observables and a command-line script to run end-to-end simulations. The library provides…
Python is rapidly becoming the lingua franca of machine learning and scientific computing. With the broad use of frameworks such as Numpy, SciPy, and TensorFlow, scientific computing and machine learning are seeing a productivity boost on…
Alloy cluster expansions (CEs) provide an accurate and computationally efficient mapping of the potential energy surface of multi-component systems that enables comprehensive sampling of the many-dimensional configuration space. Here, we…
In this paper, we present resolvent4py, a parallel Python package for the analysis, model reduction and control of large-scale linear systems with millions or billions of degrees of freedom. This package provides the user with a friendly…
Software that processes real-world data or that models a physical system must have some way of managing units. While simple approaches like the understood convention that all data are in a unit system (such as the MKS SI unit system) do…
Despite advancements in the areas of parallel and distributed computing, the complexity of programming on High Performance Computing (HPC) resources has deterred many domain experts, especially in the areas of machine learning and…
Within the last years, Python became more prominent in the scientific community and is now used for simulations, machine learning, and data analysis. All these tasks profit from additional compute power offered by parallelism and…
In recent years, there has been increasing interest in network diffusion models and related problems. The most popular of these are the independent cascade and linear threshold models. Much of the recent experimental work done on these…
Python has become the de facto language for scientific computing. Programming in Python is highly productive, mainly due to its rich science-oriented software ecosystem built around the NumPy module. As a result, the demand for Python…
Existing Python libraries and tools lack the ability to efficiently compute statistical test results for large datasets in the presence of missing values. This presents an issue as soon as constraints on runtime and memory availability…
This paper introduces a novel approach to automatic ahead-of-time (AOT) parallelization and optimization of sequential Python programs for execution on distributed heterogeneous platforms. Our approach enables AOT source-to-source…
Applications are increasingly written as dynamic workflows underpinned by an execution framework that manages asynchronous computations across distributed hardware. However, execution frameworks typically offer one-size-fits-all solutions…
The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory. IDTxl provides functionality to…
The PySCF package has emerged as a powerful and flexible open-source platform for quantum chemistry simulations. However, the efficiency of electronic structure calculations can vary significantly depending on the choice of computational…