Related papers: A parallel structured divide-and-conquer algorithm…
In this paper, an efficient divide-and-conquer (DC) algorithm is proposed for the symmetric tridiagonal matrices based on ScaLAPACK and the hierarchically semiseparable (HSS) matrices. HSS is an important type of rank-structured…
In this paper, two accelerated divide-and-conquer algorithms are proposed for the symmetric tridiagonal eigenvalue problem, which cost $O(N^2r)$ {flops} in the worst case, where $N$ is the dimension of the matrix and $r$ is a modest number…
In this paper, we present a generalized Cuppen's divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem. We extend the Cuppen's work to the rank two modifications of the form $A =T +\beta_1\bw_1\bw_1^T +…
Divide and Conquer (D&C) is a widely used algorithmic strategy for symmetric eigenvalue decomposition. Its natural parallelism makes D&C attractive on modern multicore CPUs and GPUs, but existing eigenvalue-only routines often default to…
In this paper, a Parallel Direct Eigensolver for Sequences of Hermitian Eigenvalue Problems with no tridiagonalization is proposed, denoted by \texttt{PDESHEP}, and it combines direct methods with iterative methods. \texttt{PDESHEP} first…
For dense Hermitian matrices with small off-diagonal (numerical) ranks and in a hierarchically semiseparable form, we give a stable divide-and-conquer eigendecomposition method with nearly linear complexity (called SuperDC) that…
A parallel algorithm for solving a series of matrix equations with a constant tridiagonal matrix and different right-hand sides is proposed and studied. The process of solving the problem is represented in two steps. The first preliminary…
Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and…
The growing size of modern data sets brings many challenges to the existing statistical estimation approaches, which calls for new distributed methodologies. This paper studies distributed estimation for a fundamental statistical machine…
In this paper we propose a parallel coordinate descent algorithm for solving smooth convex optimization problems with separable constraints that may arise e.g. in distributed model predictive control (MPC) for linear network systems. Our…
Divide-and-conquer-based (DC-based) evolutionary algorithms (EAs) have achieved notable success in dealing with large-scale optimization problems (LSOPs). However, the appealing performance of this type of algorithms generally requires a…
Based on the spectral divide-and-conquer algorithm by Nakatsukasa and Higham [SIAM J. Sci. Comput., 35(3): A1325-A1349, 2013], we propose a new algorithm for computing all the eigenvalues and eigenvectors of a symmetric banded matrix. For…
We propose a novel class of Sequential Monte Carlo (SMC) algorithms, appropriate for inference in probabilistic graphical models. This class of algorithms adopts a divide-and-conquer approach based upon an auxiliary tree-structured…
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications. One crucial bottleneck limiting its usage is the expensive computation cost, particularly for a mini-batch of matrices in deep neural networks. Our…
In this paper we consider the problem of identifying intersections between two sets of d-dimensional axis-parallel rectangles. This is a common problem that arises in many agent-based simulation studies, and is of central importance in the…
Many large-scale scientific computations require eigenvalue solvers in a scaling regime where efficiency is limited by data movement. We introduce a parallel algorithm for computing the eigenvalues of a dense symmetric matrix, which…
Optimally hybrid numerical solvers were constructed for massively parallel generalized eigenvalue problem (GEP).The strong scaling benchmark was carried out on the K computer and other supercomputers for electronic structure calculation…
The acceleration of sparse matrix computations on modern many-core processors, such as the graphics processing units (GPUs), has been recognized and studied over a decade. Significant performance enhancements have been achieved for many…
A tridiagonal matrix algorithm (TDMA), Pipelined-TDMA, is developed for multi-GPU systems to resolve the scalability bottlenecks caused by the sequential structure of conventional divide-and-conquer TDMA. The proposed method pipelines…
The advent of edge computing has enabled resource-constrained clients to delegate intensive computational tasks to distributed edge servers, especially within Internet of Things (IoT) environments. Among such tasks, Matrix Determinant…