Randomized algorithms for matrices and data

Michael W. Mahoney

Data Structures and Algorithms 2011-11-16 v3

Abstract

Randomized algorithms for very large matrix problems have received a great deal of attention in recent years. Much of this work was motivated by problems in large-scale data analysis, and this work was performed by individuals from many different research communities. This monograph will provide a detailed overview of recent work on the theory of randomized matrix algorithms as well as the application of those ideas to the solution of practical problems in large-scale data analysis. An emphasis will be placed on a few simple core ideas that underlie not only recent theoretical advances but also the usefulness of these tools in large-scale data applications. Crucial in this context is the connection with the concept of statistical leverage. This concept has long been used in statistical regression diagnostics to identify outliers; and it has recently proved crucial in the development of improved worst-case matrix algorithms that are also amenable to high-quality numerical implementation and that are useful to domain scientists. Randomized methods solve problems such as the linear least-squares problem and the low-rank matrix approximation problem by constructing and operating on a randomized sketch of the input matrix. Depending on the specifics of the situation, when compared with the best previously-existing deterministic algorithms, the resulting randomized algorithms have worst-case running time that is asymptotically faster; their numerical implementations are faster in terms of clock-time; or they can be implemented in parallel computing environments where existing numerical algorithms fail to run at all. Numerous examples illustrating these observations will be described in detail.
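As a rough illustration of the two ideas the abstract names, the sketch below computes statistical leverage scores and solves a least-squares problem on a randomized sketch of the input. It is a minimal example under stated assumptions, not the monograph's own algorithm: the dense Gaussian sketching matrix, the problem sizes, and the function names `leverage_scores` and `sketched_least_squares` are all chosen here for illustration. A dense Gaussian sketch is used only because it is simple to write down; it is not faster than a direct solve.

```python
import numpy as np

def leverage_scores(A):
    """Leverage scores of A: squared row norms of an orthonormal basis for its column space."""
    Q, _ = np.linalg.qr(A)  # reduced QR; Q has orthonormal columns
    return np.sum(Q**2, axis=1)

def sketched_least_squares(A, b, sketch_rows, rng):
    """Approximately solve min_x ||Ax - b||_2 by solving the smaller problem (S A, S b)."""
    n = A.shape[0]
    # Illustrative assumption: S is a dense Gaussian matrix scaled by 1/sqrt(sketch_rows).
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Synthetic tall problem (sizes are arbitrary, for illustration only).
rng = np.random.default_rng(0)
n, d = 2000, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, sketch_rows=200, rng=rng)

# Compare residuals on the ORIGINAL problem, not the sketched one.
res_exact = np.linalg.norm(A @ x_exact - b)
res_sketch = np.linalg.norm(A @ x_sketch - b)
scores = leverage_scores(A)

print("residual (exact):   ", res_exact)
print("residual (sketched):", res_sketch)
print("sum of leverage scores:", scores.sum())
```

The sketched solution cannot beat the exact minimizer on the original problem, so the comparison of interest is how close its residual comes. For a full-column-rank matrix the leverage scores sum to the number of columns, which gives a quick sanity check.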

Keywords

optimization algorithm, algorithm selection, randomized algorithm

Cite

@article{arxiv.1104.5557,
  title  = {Randomized algorithms for matrices and data},
  author = {Michael W. Mahoney},
  journal= {arXiv preprint arXiv:1104.5557},
  year   = {2011}
}

Comments

Review article, 54 pages, 198 references. Version appearing as a monograph in Now Publishers' "Foundations and Trends in Machine Learning" series.
