
Relative-Error CUR Matrix Decompositions

Data Structures and Algorithms 2007-08-29 v1

Abstract

Many data analysis applications deal with large matrices and involve approximating the matrix using a small number of ``components.'' Typically, these components are linear combinations of the rows and columns of the matrix, and are thus difficult to interpret in terms of the original features of the input data. In this paper, we propose and study matrix approximations that are explicitly expressed in terms of a small number of columns and/or rows of the data matrix, and thereby more amenable to interpretation in terms of the original data. Our main algorithmic results are two randomized algorithms which take as input an $m \times n$ matrix $A$ and a rank parameter $k$. In our first algorithm, $C$ is chosen, and we let $A'=CC^+A$, where $C^+$ is the Moore-Penrose generalized inverse of $C$. In our second algorithm $C$, $U$, $R$ are chosen, and we let $A'=CUR$. ($C$ and $R$ are matrices that consist of actual columns and rows, respectively, of $A$, and $U$ is a generalized inverse of their intersection.) For each algorithm, we show that with probability at least $1-\delta$: $$ ||A-A'||_F \leq (1+\epsilon) ||A-A_k||_F, $$ where $A_k$ is the ``best'' rank-$k$ approximation provided by truncating the singular value decomposition (SVD) of $A$. The number of columns of $C$ and rows of $R$ is a low-degree polynomial in $k$, $1/\epsilon$, and $\log(1/\delta)$. Our two algorithms are the first polynomial time algorithms for such low-rank matrix approximations that come with relative-error guarantees; previously, in some cases, it was not even known whether such matrix decompositions exist. Both of our algorithms are simple, they take time of the order needed to approximately compute the top $k$ singular vectors of $A$, and they use a novel, intuitive sampling method called ``subspace sampling.''
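As a rough illustration of the two constructions described in the abstract, the following NumPy sketch samples columns (and rows) with probabilities proportional to their squared norms in the top-k singular subspace, which is one common reading of ``subspace sampling.'' It is a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not give the sampling probabilities, the sample sizes, or any rescaling, so the function names (leverage_scores, sample_columns, column_approx, cur_approx, relative_error) and the parameters c and r are hypothetical. Two simplifications are swapped in and should be read as such: rows are sampled from the leverage scores of A itself, and U is taken to be C^+ A R^+ rather than a generalized inverse of the intersection of C and R. An exact SVD is used for brevity, whereas the abstract only requires approximate top-k singular vectors.

```python
import numpy as np

def leverage_scores(A, k):
    """Column sampling probabilities from the top-k right singular subspace.

    Assumption: p_j is proportional to the squared norm of column j of V_k^T.
    """
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k, :]                      # top-k right singular vectors, k x n
    scores = np.sum(Vk ** 2, axis=0)    # one nonnegative score per column
    return scores / scores.sum()        # normalize into a distribution

def sample_columns(A, k, c, rng):
    """Draw c column indices with replacement, then drop duplicates."""
    p = leverage_scores(A, k)
    idx = rng.choice(A.shape[1], size=c, replace=True, p=p)
    return np.unique(idx)

def column_approx(A, k, c, rng):
    """First construction: A' = C C^+ A with C made of actual columns of A."""
    cols = sample_columns(A, k, c, rng)
    C = A[:, cols]
    return C, C @ (np.linalg.pinv(C) @ A)

def cur_approx(A, k, c, r, rng):
    """Simplified CUR: actual columns C, actual rows R, and U = C^+ A R^+.

    Note: this U is a substitute chosen for simplicity, not the paper's U.
    """
    cols = sample_columns(A, k, c, rng)
    rows = sample_columns(A.T, k, r, rng)   # row scores of A are column scores of A^T
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

def relative_error(A, A_approx, k):
    """Ratio ||A - A'||_F / ||A - A_k||_F, with A_k the truncated SVD of A."""
    s = np.linalg.svd(A, compute_uv=False)
    best = np.sqrt(np.sum(s[k:] ** 2))
    return np.linalg.norm(A - A_approx) / best
```

To try it, build a noisy low-rank matrix, call column_approx(A, k, c, rng) or cur_approx(A, k, c, r, rng) with rng = np.random.default_rng(seed), and compare relative_error against 1 + epsilon for a few values of c and r. Because the sketch omits the paper's sample-size bounds, the observed ratio is only indicative and carries no guarantee.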

Cite

@article{arxiv.0708.3696,
  title  = {Relative-Error CUR Matrix Decompositions},
  author = {Petros Drineas and Michael W. Mahoney and S. Muthukrishnan},
  journal= {arXiv preprint arXiv:0708.3696},
  year   = {2007}
}

Comments

40 pages, 10 figures
