A 3D Parallel Algorithm for QR Decomposition

Distributed, Parallel, and Cluster Computing 2018-05-15 v1

Abstract

Interprocessor communication often dominates the runtime of large matrix computations. We present a parallel algorithm for computing QR decompositions whose bandwidth cost (communication volume) can be reduced in exchange for a higher latency cost (number of messages). A single parameter navigates this bandwidth/latency tradeoff, so the algorithm can be tuned for machines with different communication costs.
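
As a rough illustration of such a tradeoff, the sketch below uses the standard alpha-beta communication model (alpha seconds per message, beta seconds per word) to pick a tuning parameter for a given machine. The cost curves in `toy_costs`, the parameter name `t`, and all function names are hypothetical placeholders chosen for illustration; they are not the cost bounds of the algorithm described in the abstract.

```python
def comm_time(words, messages, alpha, beta):
    """Alpha-beta model: alpha seconds per message plus beta seconds per word."""
    return alpha * messages + beta * words


def toy_costs(n, p, t):
    """Hypothetical cost curves (NOT the paper's bounds).

    As t grows, the word count (bandwidth cost) shrinks while the
    message count (latency cost) grows, for an n-by-n matrix on p processors.
    """
    words = n * n / p ** t
    messages = p ** t
    return words, messages


def best_parameter(n, p, alpha, beta, candidates):
    """Return the candidate t with the smallest modeled communication time."""
    return min(
        candidates,
        key=lambda t: comm_time(*toy_costs(n, p, t), alpha, beta),
    )


if __name__ == "__main__":
    candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
    # Latency-dominated machine: messages are expensive, so a small t wins.
    print(best_parameter(1024, 64, alpha=1.0, beta=1e-9, candidates=candidates))
    # Bandwidth-dominated machine: words are expensive, so a large t wins.
    print(best_parameter(1024, 64, alpha=1e-6, beta=1.0, candidates=candidates))
```

Under these assumed curves, a machine with expensive messages favors the low-latency end of the range, while a machine with expensive words favors the low-bandwidth end.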

Cite

@article{arxiv.1805.05278,
  title   = {A 3D Parallel Algorithm for QR Decomposition},
  author  = {Grey Ballard and James Demmel and Laura Grigori and Mathias Jacquelin and Nicholas Knight},
  journal = {arXiv preprint arXiv:1805.05278},
  year    = {2018}
}