Mathematical Software - Scifaro

Optimizing Block-Sparse Matrix Multiplications on CUDA with TVM

We implemented and optimized matrix multiplications between dense and block-sparse matrices on CUDA. We leveraged TVM, a deep learning compiler, to explore the schedule space of the operation and generate efficient CUDA code. With the…

Mathematical Software · Computer Science 2020-07-28 Zijing Gu

Making RooFit Ready for Run 3

RooFit and RooStats, the toolkits for statistical modelling in ROOT, are used in most searches and measurements at the Large Hadron Collider. The data to be collected in Run 3 will enable measurements with higher precision and models with…

Mathematical Software · Computer Science 2020-07-27 Stephan Hageboeck, Lorenzo Moneta

Characteristics-based Simulink implementation of first-order quasilinear partial differential equations

The paper deals with solving first-order quasilinear partial differential equations in an online simulation environment, such as Simulink, utilizing the well-known and well-recommended method of characteristics. Compared to the commonly…

Mathematical Software · Computer Science 2020-07-27 Anton Ponomarev, Julian Hofmann, Lutz Gröll

Approaches to the implementation of generalized complex numbers in the Julia language

In problems of mathematical physics, to study the structures of spaces using the Cayley-Klein models in theoretical calculations, the use of generalized complex numbers is required. In the case of computational experiments, such tasks…

Mathematical Software · Computer Science 2020-07-21 Migran N. Gevorkyan, Anna V. Korolkova, Dmitry S. Kulyabov

Accelerating Geometric Multigrid Preconditioning with Half-Precision Arithmetic on GPUs

With the hardware support for half-precision arithmetic on NVIDIA V100 GPUs, high-performance computing applications can benefit from lower precision at appropriate spots to speed up the overall execution time. In this paper, we investigate…

Mathematical Software · Computer Science 2020-07-16 Kyaw L. Oo, Andreas Vogel

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic

In recent years, hardware vendors have started designing low-precision special function units in response to the machine learning community's demand for high compute power in low-precision formats. Also the…

Mathematical Software · Computer Science 2020-07-15 Ahmad Abdelfattah, Hartwig Anzt, Erik G. Boman, Erin Carson, Terry Cojean, Jack Dongarra, Mark Gates, Thomas Grützmacher, Nicholas J. Higham, Sherry Li, Neil Lindquist, Yang Liu, Jennifer Loe, Piotr Luszczek, Pratik Nayak, Sri Pranesh, Siva Rajamanickam, Tobias Ribizel, Barry Smith, Kasia Swirydowicz, Stephen Thomas, Stanimire Tomov, Yaohung M. Tsai, Ichitaro Yamazaki, Ulrike Meier Yang

MFEM: a modular finite element methods library

MFEM is an open-source, lightweight, flexible and scalable C++ library for modular finite element methods that features arbitrary high-order finite element meshes and spaces, support for a wide variety of discretization approaches and…

Mathematical Software · Computer Science 2020-07-15 Robert Anderson, Julian Andrej, Andrew Barker, Jamie Bramwell, Jean-Sylvain Camier, Jakub Cerveny, Veselin Dobrev, Yohann Dudouit, Aaron Fisher, Tzanio Kolev, Will Pazner, Mark Stowell, Vladimir Tomov, Johann Dahm, David Medina, Stefano Zampini

ExaHyPE: An Engine for Parallel Dynamically Adaptive Simulations of Wave Problems

ExaHyPE ("An Exascale Hyperbolic PDE Engine") is a software engine for solving systems of first-order hyperbolic partial differential equations (PDEs). Hyperbolic PDEs are typically derived from the conservation laws of physics and are…

Mathematical Software · Computer Science 2020-07-15 Anne Reinarz, Dominic E. Charrier, Michael Bader, Luke Bovard, Michael Dumbser, Kenneth Duru, Francesco Fambri, Alice-Agnes Gabriel, Jean-Matthieu Gallard, Sven Köppel, Lukas Krenz, Leonhard Rannabauer, Luciano Rezzolla, Philipp Samfass, Maurizio Tavelli, Tobias Weinzierl

TuckerMPI: A Parallel C++/MPI Software Package for Large-scale Data Compression via the Tucker Tensor Decomposition

Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed a new software package called TuckerMPI, a parallel C++/MPI…

Mathematical Software · Computer Science 2020-07-09 Grey Ballard, Alicia Klinvex, Tamara G. Kolda

SParSH-AMG: A library for hybrid CPU-GPU algebraic multigrid and preconditioned iterative methods

Hybrid CPU-GPU algorithms for Algebraic Multigrid (AMG) methods that efficiently utilize both CPU and GPU resources are presented. In particular, a hybrid AMG framework focusing on minimal utilization of GPU memory with performance on par with…

Mathematical Software · Computer Science 2020-07-02 Sashikumaar Ganesan, Manan Shah

Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing

In this paper, we present Ginkgo, a modern C++ math library for scientific high performance computing. While classical linear algebra libraries act on matrix and vector objects, Ginkgo's design principle abstracts all functionality as…

Mathematical Software · Computer Science 2020-07-02 Hartwig Anzt, Terry Cojean, Goran Flegar, Fritz Göbel, Thomas Grützmacher, Pratik Nayak, Tobias Ribizel, Yuhsiang Mike Tsai, Enrique S. Quintana-Ortí

Hierarchical Jacobi Iteration for Structured Matrices on GPUs using Shared Memory

High fidelity scientific simulations modeling physical phenomena typically require solving large linear systems of equations which result from discretization of a partial differential equation (PDE) by some numerical method. This step often…

Mathematical Software · Computer Science 2020-07-01 Mohammad Shafaet Islam, Qiqi Wang

Automatic Generation of Efficient Sparse Tensor Format Conversion Routines

This paper shows how to generate code that efficiently converts sparse tensors between disparate storage formats (data layouts) such as CSR, DIA, ELL, and many others. We decompose sparse tensor conversion into three logical phases:…

Mathematical Software · Computer Science 2020-07-01 Stephen Chou, Fredrik Kjolstad, Saman Amarasinghe

Preparing Ginkgo for AMD GPUs -- A Testimonial on Porting CUDA Code to HIP

With AMD reinforcing their ambition in the scientific high performance computing ecosystem, we extend the hardware scope of the Ginkgo linear algebra package to feature a HIP backend for AMD GPUs. In this paper, we report and discuss the…

Mathematical Software · Computer Science 2020-06-26 Yuhsiang M. Tsai, Terry Cojean, Tobias Ribizel, Hartwig Anzt

Performance Engineering for Real and Complex Tall & Skinny Matrix Multiplication Kernels on GPUs

General matrix-matrix multiplications with double-precision real and complex entries (DGEMM and ZGEMM) in vendor-supplied BLAS libraries are best optimized for square matrices but often show bad performance for tall & skinny matrices, which…

Mathematical Software · Computer Science 2020-06-25 Dominik Ernst, Georg Hager, Jonas Thies, Gerhard Wellein

Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs

Chebyshev filter diagonalization is well established in quantum chemistry and quantum physics to compute bulks of eigenvalues of large sparse matrices. Choosing a block vector implementation, we investigate optimization opportunities on the…

Mathematical Software · Computer Science 2020-06-25 Moritz Kreutzer, Georg Hager, Dominik Ernst, Holger Fehske, Alan R. Bishop, Gerhard Wellein

RationalizeRoots: Software Package for the Rationalization of Square Roots

The computation of Feynman integrals often involves square roots. One way to obtain a solution in terms of multiple polylogarithms is to rationalize these square roots by a suitable variable change. We present a program that can be used to…

Mathematical Software · Computer Science 2020-06-24 Marco Besier, Pascal Wasser, Stefan Weinzierl

Eigen-AD: Algorithmic Differentiation of the Eigen Library

In this work we present useful techniques and possible enhancements when applying an Algorithmic Differentiation (AD) tool to the linear algebra library Eigen using our in-house AD by overloading (AD-O) tool dco/c++ as a case study. After…

Mathematical Software · Computer Science 2020-06-23 Patrick Peltzer, Johannes Lotz, Uwe Naumann

The DUNE Framework: Basic Concepts and Recent Developments

This paper presents the basic concepts and the module structure of the Distributed and Unified Numerics Environment and reflects on recent developments and general changes that happened since the release of the first Dune version in 2007…

Mathematical Software · Computer Science 2020-06-23 Peter Bastian, Markus Blatt, Andreas Dedner, Nils-Arne Dreier, Christian Engwer, René Fritze, Carsten Gräser, Christoph Grüninger, Dominic Kempf, Robert Klöfkorn, Mario Ohlberger, Oliver Sander

Delayed approximate matrix assembly in multigrid with dynamic precisions

The accurate assembly of the system matrix is an important step in any code that solves partial differential equations on a mesh. We either explicitly set up a matrix, or we work in a matrix-free environment where we have to be able to…

Mathematical Software · Computer Science 2020-06-19 Charles D. Murray, Tobias Weinzierl