Related papers: Accelerating eigenvector and pseudospectra computa…
Inverse iteration is known to be an effective method for computing eigenvectors corresponding to simple and well-separated eigenvalues. In the non-symmetric case, the solution of shifted Hessenberg systems is a central step. Existing…
Applications in artificial intelligence, machine learning, and system identification simulations rely on eigenvectors. Calculating eigenvectors for very large matrices using conventional methods is compute-intensive and…
We develop a distributed Block Chebyshev-Davidson algorithm to solve large-scale leading eigenvalue problems for spectral analysis in spectral clustering. First, the efficiency of the Chebyshev-Davidson algorithm relies on the prior…
In symmetric block eigenvalue algorithms, such as the subspace iteration algorithm and the locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm, a large block size is often employed to achieve robustness and rapid…
Basic Linear Algebra Subprograms (BLAS) are a set of low-level linear algebra kernels widely adopted by applications in deep learning and scientific computing. The massive and economical computing power brought forth by the…
We propose a simple technique that, if combined with algorithms for computing functions of triangular matrices, can make them more efficient. Basically, such a technique consists in a specific scaling similarity transformation that reduces…
Diagonalization of a large matrix is the computational bottleneck in many applications such as electronic structure calculations. We show that a speedup of over 30% can be achieved by exploiting 32-bit floating point operations, while…
Tensor accelerators have gained popularity because they provide a cheap and efficient solution for speeding up computationally expensive tasks in Deep Learning and, more recently, in other Scientific Computing applications. However, since…
Block-tridiagonal systems are prevalent in state estimation and optimal control, and solving these systems is often the computational bottleneck. Improving the underlying solvers therefore has a direct impact on the real-time performance of…
We propose an efficient algorithm for computing a common eigenvector of a finite set of square matrices. As an immediate consequence we obtain an algorithm for determining whether the matrices admit a simultaneous triangulation, and, if so,…
We develop and analyze new scheduling algorithms for solving sparse triangular linear systems (SpTRSV) in parallel. Our approach produces highly efficient synchronous schedules for the forward- and backward-substitution algorithm. Compared…
We display methods that allow for computations of spectra, pseudospectra and resolvents of linear operators on Hilbert spaces and also elements in unital Banach algebras. The paper considers two different approaches, namely, pseudospectral…
The solution of eigenproblems is often a key computational bottleneck that limits the tractable system size of numerical algorithms, among them electronic structure theory in chemistry and in condensed matter physics. Large eigenproblems…
A novel orthogonalization-free method, together with two specific algorithms, is proposed to solve extreme eigenvalue problems. On top of gradient-based algorithms, the proposed algorithms modify the multi-column gradient such that earlier…
Efficient solution of 3D elasticity problems is an important part of many industrial and scientific applications. Smoothed aggregation algebraic multigrid using rigid body modes for the tentative prolongation operator construction is an…
Spectral clustering approaches have led to well-accepted algorithms for finding accurate clusters in a given dataset. However, their application to large-scale datasets has been hindered by the computational complexity of eigenvalue…
In this paper, we systematically investigate GPU-based parallel triangular solvers. Parallel triangular solvers are fundamental to preconditioners in the incomplete LU factorization family and to algebraic multigrid solvers. We develop a new…
In Density Functional Theory simulations based on the LAPW method, each self-consistent field cycle comprises dozens of large dense generalized eigenproblems. In contrast to real-space methods, eigenpairs solving for problems at distinct…
BLAS Level 3 operations are essential for scientific computing, but finding the optimal number of threads for multi-threaded implementations on modern multi-core systems is challenging. We present an extension to the Architecture and…
The work reported in this article presents a high-order, stable, and efficient Gegenbauer pseudospectral method to solve numerically a wide variety of mathematical models. The proposed numerical scheme exploits the stability and the…