English
Related papers

Related papers: Column Selection via Adaptive Sampling

200 papers

We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to…

Machine Learning · Statistics 2018-01-26 Yining Wang , Aarti Singh

Adaptive randomized pivoting (ARP) is a recently proposed and highly effective algorithm for column subset selection. This paper reinterprets the ARP algorithm by drawing connections to the volume sampling distribution and active learning…

Machine Learning · Statistics 2026-04-06 Ethan N. Epperly

The era of huge data necessitates highly efficient machine learning algorithms. Many common machine learning algorithms, however, rely on computationally intensive subroutines that are prohibitively expensive on large datasets. Oftentimes,…

Machine Learning · Computer Science 2023-09-26 Mo Tiwari

We consider the related tasks of matrix completion and matrix approximation from missing data and propose adaptive sampling procedures for both problems. We show that adaptive sampling allows one to eliminate standard incoherence…

Machine Learning · Statistics 2014-07-15 Akshay Krishnamurthy , Aarti Singh

Adaptive sampling is a useful algorithmic tool for data summarization problems in the classical centralized setting, where the entire dataset is available to the single processor performing the computation. Adaptive sampling repeatedly…

Data Structures and Algorithms · Computer Science 2020-04-24 Sepideh Mahabadi , Ilya Razenshteyn , David P. Woodruff , Samson Zhou

This paper examines the problem of locating outlier columns in a large, otherwise low-rank, matrix. We propose a simple two-step adaptive sensing and inference approach and establish theoretical guarantees for its performance; our results…

Information Theory · Computer Science 2015-06-22 Xingguo Li , Jarvis Haupt

We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces. We show that an adaptive sampling strategy for the random subspace significantly…

Optimization and Control · Mathematics 2019-12-19 Jonathan Lacotte , Mert Pilanci , Marco Pavone

Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time…

Data Structures and Algorithms · Computer Science 2014-08-22 Michael B. Cohen , Yin Tat Lee , Cameron Musco , Christopher Musco , Richard Peng , Aaron Sidford

We derive a new adaptive leverage score sampling strategy for solving the Column Subset Selection Problem (CSSP). The resulting algorithm, called Adaptive Randomized Pivoting, can be viewed as a randomization of Osinsky's recently proposed…

Numerical Analysis · Mathematics 2025-06-23 Alice Cortinovis , Daniel Kressner

We study the problem of exact completion for $m \times n$ sized matrix of rank $r$ with the adaptive sampling method. We introduce a relation of the exact completion problem with the sparsest vector of column and row spaces (which we call…

Machine Learning · Computer Science 2022-03-08 Ilqar Ramazanli , Barnabas Poczos

The problem of column subset selection has recently attracted a large body of research, with feature selection serving as one obvious and important application. Among the techniques that have been applied to solve this problem, the greedy…

Data Structures and Algorithms · Computer Science 2021-11-16 Jason Altschuler , Aditya Bhaskara , Gang Fu , Vahab Mirrokni , Afshin Rostamizadeh , Morteza Zadimoghaddam

We present a matrix-factorization algorithm that scales to input matrices with both huge number of rows and columns. Learned factors may be sparse or dense and/or non-negative, which makes our algorithm suitable for dictionary learning,…

Machine Learning · Statistics 2017-11-15 Arthur Mensch , Julien Mairal , Bertrand Thirion , Gael Varoquaux

In the big data era researchers face a series of problems. Even standard approaches/methodologies, like linear regression, can be difficult or problematic with huge volumes of data. Traditional approaches for regression in big datasets may…

Methodology · Statistics 2024-11-13 Vasilis Chasiotis , Dimitris Karlis

The problem of column subset selection asks for a subset of columns from an input matrix such that the matrix can be reconstructed as accurately as possible within the span of the selected columns. A natural extension is to consider a…

Machine Learning · Computer Science 2024-08-13 Antonis Matakos , Bruno Ordozgoiti , Suhas Thejaswi

This paper addresses matrix approximation problems for matrices that are large, sparse and/or that are representations of large graphs. To tackle these problems, we consider algorithms that are based primarily on coarsening techniques,…

Numerical Analysis · Computer Science 2018-10-03 Shashanka Ubaru , Yousef Saad

Learning interpretable models has become a major focus of machine learning research, given the increasing prominence of machine learning in socially important decision-making. Among interpretable models, rule lists are among the best-known…

Machine Learning · Computer Science 2024-06-19 Leonardo Pellegrina , Fabio Vandin

Subset selection in multiple linear regression aims to choose a subset of candidate explanatory variables that tradeoff fitting error (explanatory power) and model complexity (number of variables selected). We build mathematical programming…

Machine Learning · Statistics 2020-09-04 Young Woong Park , Diego Klabjan

We propose a randomized method for solving linear programs with a large number of columns but a relatively small number of constraints. Since enumerating all the columns is usually unrealistic, such linear programs are commonly solved by…

Optimization and Control · Mathematics 2023-11-29 Yi-Chun Akchen , Velibor V. Mišić

The number of non-negative integer matrices with given row and column sums appears in a variety of problems in mathematics and statistics but no closed-form expression for it is known, so we rely on approximations of various kinds. Here we…

Computation · Statistics 2024-01-25 Maximilian Jerdee , Alec Kirkley , M. E. J. Newman

Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets. In recent years, several works have proposed methods for subsampling rows from a data matrix while maintaining relevant information…

Machine Learning · Computer Science 2023-01-18 Fred Lu , Edward Raff , James Holt
‹ Prev 1 2 3 10 Next ›