Randomized algorithms for matrices and data

Data Structures and Algorithms 2011-11-16 v3

Abstract

Randomized algorithms for very large matrix problems have received a great deal of attention in recent years. Much of this work was motivated by problems in large-scale data analysis, and it was performed by researchers from many different communities. This monograph provides a detailed overview of recent work on the theory of randomized matrix algorithms, as well as the application of those ideas to practical problems in large-scale data analysis. The emphasis is on a few simple core ideas that underlie not only recent theoretical advances but also the usefulness of these tools in large-scale data applications. Crucial in this context is the connection with the concept of statistical leverage. This concept has long been used in statistical regression diagnostics to identify outliers, and it has recently proved crucial in the development of improved worst-case matrix algorithms that are also amenable to high-quality numerical implementation and that are useful to domain scientists. Randomized methods solve problems such as the linear least-squares problem and the low-rank matrix approximation problem by constructing and operating on a randomized sketch of the input matrix. Depending on the specifics of the situation, when compared with the best previously existing deterministic algorithms, the resulting randomized algorithms have worst-case running time that is asymptotically faster; their numerical implementations are faster in terms of clock time; or they can be implemented in parallel computing environments where existing numerical algorithms fail to run at all. Numerous examples illustrating these observations are described in detail.
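To make the two central notions concrete, the following is a minimal sketch, not taken from the monograph, of statistical leverage and of solving a least-squares problem on a randomized sketch. It assumes a toy setting chosen purely for illustration: a two-column design matrix (intercept plus one covariate), synthetic data with one far-away point, and a sketch built by sampling rows with probability proportional to their leverage scores. All function and variable names are hypothetical. Computing exact leverage scores this way costs as much as solving the original problem, so the example shows the idea rather than a speedup.

```python
import math
import random


def gram_2x2(rows):
    """Entries (g11, g12, g22) of A^T A for an n-by-2 matrix A given as row tuples."""
    g11 = sum(x * x for x, _ in rows)
    g12 = sum(x * y for x, y in rows)
    g22 = sum(y * y for _, y in rows)
    return g11, g12, g22


def solve_least_squares(rows, b):
    """Solve min_x ||A x - b||_2 for an n-by-2 matrix A via the normal equations."""
    g11, g12, g22 = gram_2x2(rows)
    c1 = sum(r[0] * bi for r, bi in zip(rows, b))
    c2 = sum(r[1] * bi for r, bi in zip(rows, b))
    det = g11 * g22 - g12 * g12
    return ((g22 * c1 - g12 * c2) / det, (g11 * c2 - g12 * c1) / det)


def leverage_scores(rows):
    """Leverage h_i = a_i^T (A^T A)^{-1} a_i, the i-th diagonal entry of the hat matrix."""
    g11, g12, g22 = gram_2x2(rows)
    det = g11 * g22 - g12 * g12
    return [(g22 * x * x - 2 * g12 * x * y + g11 * y * y) / det for x, y in rows]


def leverage_sketch(rows, b, s, rng):
    """Sample s rows with probability h_i / d and rescale each by 1 / sqrt(s * p_i)."""
    d = 2  # number of columns; the leverage scores sum to d
    probs = [h / d for h in leverage_scores(rows)]
    picked = rng.choices(range(len(rows)), weights=probs, k=s)
    sk_rows, sk_b = [], []
    for i in picked:
        w = 1.0 / math.sqrt(s * probs[i])
        sk_rows.append((w * rows[i][0], w * rows[i][1]))
        sk_b.append(w * b[i])
    return sk_rows, sk_b


# Synthetic data (an assumption for illustration): b = 2 + 3 t + noise.
rng = random.Random(0)
n, s = 2000, 200
t = [rng.uniform(-1.0, 1.0) for _ in range(n)]
t[0] = 25.0  # one point far from the rest, expected to have high leverage
rows = [(1.0, ti) for ti in t]
b = [2.0 + 3.0 * ti + rng.gauss(0.0, 0.1) for ti in t]

scores = leverage_scores(rows)
x_full = solve_least_squares(rows, b)

sk_rows, sk_b = leverage_sketch(rows, b, s, rng)
x_sketch = solve_least_squares(sk_rows, sk_b)

print("largest leverage score at row", scores.index(max(scores)))
print("full solution:    ", x_full)
print("sketched solution:", x_sketch)
```

The far-away point receives by far the largest leverage score, which is the property regression diagnostics use to flag it, and the same scores serve as sampling probabilities so that the 200-row sketch retains that influential row with high probability.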

Cite

@article{arxiv.1104.5557,
  title   = {Randomized algorithms for matrices and data},
  author  = {Michael W. Mahoney},
  journal = {arXiv preprint arXiv:1104.5557},
  year    = {2011}
}

Comments

Review article, 54 pages, 198 references. Version appearing as a monograph in Now Publishers' "Foundations and Trends in Machine Learning" series.