Related papers: A Nested Krylov Method Using Half-Precision Arithm…
Krylov methods provide a fast and highly parallel numerical tool for the iterative solution of many large-scale sparse linear systems. To a large extent, the performance of practical realizations of these methods is constrained by the…
With the emergence of mixed precision capabilities in hardware, iterative refinement schemes for solving linear systems $Ax=b$ have recently been revisited and reanalyzed in the context of three or more precisions. These new analyses show…
Sparse matrices have recently played a significant and impactful role in scientific computing, including artificial intelligence-related fields. According to historical studies on sparse matrix--vector multiplication (SpMV), Krylov subspace…
Low precision arithmetic, in particular half precision floating point arithmetic, is now available in commercial hardware. Using lower precision can offer significant savings in computation and communication costs with proportional savings…
With the hardware support for half-precision arithmetic on NVIDIA V100 GPUs, high-performance computing applications can benefit from lower precision at appropriate spots to speed up the overall execution time. In this paper, we investigate…
This study presents a novel mixed-precision iterative refinement algorithm, GADI-IR, within the general alternating-direction implicit (GADI) framework, designed for efficiently solving large-scale sparse linear systems. By employing…
One of the most computationally expensive steps of the low-rank ADI method for large-scale Lyapunov equations is the solution of a shifted linear system at each iteration. We propose the use of the extended Krylov subspace method for this…
The GMRES method is used to solve sparse, non-symmetric systems of linear equations arising from many scientific applications. The solver performance within a single node is memory bound, due to the low arithmetic intensity of its…
Since numbers in the computer are represented with a fixed number of bits, loss of accuracy during calculation is unavoidable. At high precision where more bits (e.g. 64) are allocated to each number, round-off errors are typically small.…
Nowadays, many fields of study are have to deal with large and sparse data matrixes, but the most important issue is finding the inverse of these matrixes. Thankfully, Krylov subspace methods can be used in solving these types of problem.…
Support for lower precision computation is becoming more common in accelerator hardware due to lower power usage, reduced data movement and increased computational performance. However, computational science and engineering (CSE) problems…
In the last decade, tensors have shown their potential as valuable tools for various tasks in numerical linear algebra. While most of the research has been focusing on how to compress a given tensor in order to maintain information as well…
The emergence of low precision floating-point arithmetic in computer hardware has led to a resurgence of interest in the use of mixed precision numerical linear algebra. For linear systems of equations, there has been renewed enthusiasm for…
Several Krylov-type procedures are introduced that generalize matrix Krylov methods for tensor computations. They are denoted minimal Krylov recursion, maximal Krylov recursion, contracted tensor product Krylov recursion. It is proved that…
We present variants of the Conjugate Gradient (CG), Conjugate Residual (CR), and Generalized Minimal Residual (GMRES) methods which are both pipelined and flexible. These allow computation of inner products and norms to be overlapped with…
Support for lower precision computation is becoming more common in accelerator hardware due to lower power usage, reduced data movement and increased computational performance. However, computational science and engineering (CSE) problems…
We investigate the use of half-precision floating-point numbers (FP16) in mixed-precision linear solvers for lattice QCD simulations. Since the emergence of GPUs for general-purpose, mixed-precision algorithms that combine single-precision…
Parallel implementations of Krylov subspace methods often help to accelerate the procedure of finding an approximate solution of a linear system. However, such parallelization coupled with asynchronous and out-of-order execution often…
In recent years, the fervent demand for computational power across various domains has prompted hardware manufacturers to introduce specialized computing hardware aimed at enhancing computational capabilities. Particularly, the utilization…
Pipelined Krylov methods seek to ameliorate the latency due to inner products necessary for projection by overlapping it with the computation associated with sparse matrix-vector multiplication. We clarify a folk theorem that this can only…