Mathematical Software
KBLAS is a new open source high performance library that provides optimized kernels for a subset of Level 2 BLAS functionalities on CUDA-enabled GPUs. Since performance of dense matrix-vector multiplication is hindered by the overhead of…
In a series of papers it has been shown that for many linear algebra operations it is possible to generate families of algorithms by following a systematic procedure. Although powerful, such a methodology involves complex algebraic…
In recent years it has been shown that for many linear algebra operations it is possible to create families of algorithms following a very systematic procedure. We do not refer to the fine tuning of a known algorithm, but to a methodology…
Tensor operations are surging as the computational building blocks for a variety of scientific simulations and the development of high-performance kernels for such operations is known to be a challenging task. While for operations on one-…
We present a novel finite element integration method for low order elements on GPUs. We achieve more than 100GF for element integration on first order discretizations of both the Laplacian and Elasticity operators.
The Isabelle proof assistant comes equipped with a very powerful tactic for term simplification. While tremendously useful, the results of simplifying a term do not always match the user's expectation: sometimes, the resulting term is not…
Emergence of stochastic simulations as an extensively used computational tool for scientific purposes intensified the need for more accurate ways of generating sufficiently long sequences of uncorrelated random numbers. Even though several…
The paper demonstrates the optimization of the execution environment of a hybrid OpenMP+MPI computational fluid dynamics code (shallow water equation solver) on a cluster enabled with Intel Xeon Phi coprocessors. The discussion includes:…
Various fields of science and engineering rely on linear algebra for large scale data analysis, modeling and simulation, machine learning, and other applied problems. Linear algebra computations often dominate the execution time of such…
The spheroidal wave functions, which are the solutions to the Helmholtz equation in spheroidal coordinates, are notoriously difficult to compute. Because of this, practically no programming language comes equipped with the means to compute…
scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications. It is released under the liberal "Modified BSD" open source license, provides a well-documented…
Conversion between binary and decimal floating-point representations is ubiquitous. Floating-point radix conversion means converting both the exponent and the mantissa. We develop an atomic operation for FP radix conversion with simple…
Modular integer arithmetic occurs in many algorithms for computer algebra, cryptography, and error correcting codes. Although recent microprocessors typically offer a wide range of highly optimized arithmetic functions, modular integer…
We describe in this paper new design techniques used in the \cpp exact linear algebra library \linbox, intended to make the library safer and easier to use, while keeping it generic and efficient. First, we review the new simplified…
In many scientific applications the solution of non-linear differential equations are obtained through the set-up and solution of a number of successive eigenproblems. These eigenproblems can be regarded as a sequence whenever the solution…
ViDaExpert is a tool for visualization and analysis of multidimensional vectorial data. ViDaExpert is able to work with data tables of "object-feature" type that might contain numerical feature values as well as textual labels for rows…
The paper describes a SageTeX implementation of an integer encoding procedures.
Pseudo-spectral method is one of the most accurate techniques for simulating turbulent flows. Fast Fourier transform (FFT) is an integral part of this method. In this paper, we present a new procedure to compute FFT in which we save…
Many problems in computational science and engineering involve partial differential equations and thus require the numerical solution of large, sparse (non)linear systems of equations. Multigrid is known to be one of the most efficient…
Python implementation of permutations is presented. Three classes are introduced: Perm for permutations, Group for permutation groups, and PermError to report any errors for both classes. The class Perm is based on Python dictionaries and…