Mathematical Software
In this work, we present two parallel algorithms for the large-scale discrete Fourier transform (DFT) on Tensor Processing Unit (TPU) clusters. The two parallel algorithms are associated with two formulations of DFT: one is based on the…
RooFit and RooStats, the toolkits for statistical modelling in ROOT, are used in most searches and measurements at the Large Hadron Collider as well as at $B$ factories. Larger datasets to be collected at e.g. the High-Luminosity LHC will…
A blend of two Taylor series for the same smooth real- or complex-valued function of a single variable can be useful for approximation. We use an explicit formula for a two-point Hermite interpolational polynomial to construct such blends.…
Numerical solutions of hyperbolic partial differential equations(PDEs) are ubiquitous in science and engineering. Method of lines is a popular approach to discretize PDEs defined in spacetime, where space and time are discretized…
This paper presents a recently developed particle simulation code package PIFE-PIC, which is a novel three-dimensional (3-D) Parallel Immersed-Finite-Element (IFE) Particle-in-Cell (PIC) simulation model for particle simulations of…
Given the importance of floating-point~(FP) performance in numerous domains, several new variants of FP and its alternatives have been proposed (e.g., Bfloat16, TensorFloat32, and Posits). These representations do not have correctly rounded…
Crew Pairing Optimization aims at generating a set of flight sequences (crew pairings), covering all flights in an airline's flight schedule, at minimum cost, while satisfying several legality constraints. CPO is critically important for…
The generalized singular value decomposition (GSVD, a.k.a. "SVD triplet", "duality diagram" approach) provides a unified strategy and basis to perform nearly all of the most common multivariate analyses (e.g., principal components,…
This work is a rigorous development of a complete and general-purpose deep learning framework from the ground up. The fundamental components of deep learning - automatic differentiation and gradient methods of optimizing multivariable…
The numerical solution of partial differential equations is at the heart of many grand challenges in supercomputing. Solvers based on high-order discontinuous Galerkin (DG) discretisation have been shown to scale on large supercomputers…
Calcium is a C library for real and complex numbers in a form suitable for exact algebraic and symbolic computation. Numbers are represented as elements of fields $\mathbb{Q}(a_1,\ldots,a_n)$ where the extensions numbers $a_k$ may be…
"Code Generation for Generally Mapped Finite Elements" includes performance results for the finite element methods discussed in that manuscript. The authors provided a Zenodo archive with the Firedrake components and dependencies used, as…
DSLib is an open-source implementation of the Dominant Set (DS) clustering algorithm written entirely in Matlab. The DS method is a graph-based clustering technique rooted in the evolutionary game theory that starts gaining lots of interest…
Stencil computations represent a very common class of nested loops in scientific and engineering applications. Exploiting vector units in modern CPUs is crucial to achieving peak performance. Previous vectorization approaches often consider…
ParaMonte::Python (standing for Parallel Monte Carlo in Python) is a serial and MPI-parallelized library of (Markov Chain) Monte Carlo (MCMC) routines for sampling mathematical objective functions, in particular, the posterior distributions…
Scipp is heavily inspired by the Python library xarray. It enriches raw NumPy-like multi-dimensional arrays of data by adding named dimensions and associated coordinates. Multiple arrays are combined into datasets. On top of this, scipp…
Sparse general matrix-matrix multiplication (spGEMM) is an essential component in many scientific and data analytics applications. However, the sparsity pattern of the input matrices and the interaction of their patterns make spGEMM…
ParaMonte (standing for Parallel Monte Carlo) is a serial and MPI/Coarray-parallelized library of Monte Carlo routines for sampling mathematical objective functions of arbitrary-dimensions, in particular, the posterior distributions of…
Iterative methods for solving large sparse systems of linear equations are widely used in many HPC applications. Extreme scaling of these methods can be difficult, however, since global communication to form dot products is typically…
Krylov methods provide a fast and highly parallel numerical tool for the iterative solution of many large-scale sparse linear systems. To a large extent, the performance of practical realizations of these methods is constrained by the…