Related papers: Reverse iterative volume sampling for linear regre…

Leveraged volume sampling for linear regression

Suppose an $n \times d$ design matrix in a linear regression problem is given, but the response for each point is hidden unless explicitly requested. The goal is to sample only a small number $k \ll n$ of the responses, and then produce a…

Machine Learning · Computer Science 2018-09-06 Michał Dereziński, Manfred K. Warmuth, Daniel Hsu

Correcting the bias in least squares regression with volume-rescaled sampling

Consider linear regression where the examples are generated by an unknown distribution on $R^d\times R$. Without any assumptions on the noise, the linear least squares solution for any i.i.d. sample will typically be biased w.r.t. the least…

Machine Learning · Computer Science 2018-10-08 Michał Dereziński, Manfred K. Warmuth, Daniel Hsu

Unbiased estimates for linear regression via volume sampling

Given a full rank matrix $X$ with more columns than rows, consider the task of estimating the pseudo inverse $X^+$ based on the pseudo inverse of a sampled subset of columns (of size at least the number of rows). We show that this is…

Machine Learning · Computer Science 2018-06-07 Michał Dereziński, Manfred K. Warmuth

Unbiased estimators for random design regression

In linear regression we wish to estimate the optimum linear least squares predictor for a distribution over $d$-dimensional input points and real-valued responses, based on a small sample. Under standard random design analysis, where the…

Machine Learning · Statistics 2022-06-08 Michał Dereziński, Manfred K. Warmuth, Daniel Hsu

A statistical perspective of sampling scores for linear regression

In this paper, we consider a statistical problem of learning a linear model from noisy samples. Existing work has focused on approximating the least squares solution by using leverage-based scores as an importance sampling distribution.…

Machine Learning · Statistics 2016-02-11 Siheng Chen, Rohan Varma, Aarti Singh, Jelena Kovačević

Proportional Volume Sampling and Approximation Algorithms for A-Optimal Design

We study the optimal design problems where the goal is to choose a set of linear measurements to obtain the most accurate estimate of an unknown vector in $d$ dimensions. We study the $A$-optimal design variant where the objective is to…

Data Structures and Algorithms · Computer Science 2018-07-18 Aleksandar Nikolov, Mohit Singh, Uthaipon Tao Tantipongpipat

Minimax experimental design: Bridging the gap between statistical and worst-case approaches to least squares regression

In experimental design, we are given a large collection of vectors, each with a hidden response value that we assume derives from an underlying linear model, and we wish to pick a small subset of the vectors such that querying the…

Machine Learning · Computer Science 2019-02-05 Michał Dereziński, Kenneth L. Clarkson, Michael W. Mahoney, Manfred K. Warmuth

Least squares approximations in linear statistical inverse learning problems

Statistical inverse learning aims at recovering an unknown function $f$ from randomly scattered and possibly noisy point evaluations of another function $g$, connected to $f$ via an ill-posed mathematical model. In this paper we blend…

Statistics Theory · Mathematics 2024-01-22 Tapio Helin

Efficient Algorithms for Outlier-Robust Regression

We give the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels. Given a sufficiently large (polynomial-size) training set drawn i.i.d. from…

Machine Learning · Computer Science 2020-06-05 Adam Klivans, Pravesh K. Kothari, Raghu Meka

Optimal weighted least-squares methods

We consider the problem of reconstructing an unknown bounded function $u$ defined on a domain $X\subset \mathbb{R}^d$ from noiseless or noisy samples of $u$ at $n$ points $(x^i)_{i=1,\dots,n}$. We measure the reconstruction error in a norm…

Numerical Analysis · Mathematics 2016-08-02 Albert Cohen, Giovanni Migliorati

Independent Range Sampling, Revisited Again

We revisit the range sampling problem: the input is a set of points where each point is associated with a real-valued weight. The goal is to store them in a structure such that given a query range and an integer $k$, we can extract $k$…

Data Structures and Algorithms · Computer Science 2019-03-20 Peyman Afshani, Jeff M. Phillips

Uniform Sampling for Matrix Approximation

Random sampling has become a critical tool in solving massive matrix problems. For linear regression, a small, manageable set of data rows can be randomly selected to approximate a tall, skinny data matrix, improving processing time…

Data Structures and Algorithms · Computer Science 2014-08-22 Michael B. Cohen, Yin Tat Lee, Cameron Musco, Christopher Musco, Richard Peng, Aaron Sidford

Optimal sampling for least squares approximation with general dictionaries

We consider the problem of approximating an unknown function from point evaluations. This problem is a crucial subproblem in many modern (nonlinear) approximation schemes. When obtaining these point evaluations is costly, minimising the…

Numerical Analysis · Mathematics 2025-12-03 Philipp Trunschke, Anthony Nouy

Efficient volume sampling for row/column subset selection

We give efficient algorithms for volume sampling, i.e., for picking $k$-subsets of the rows of any given matrix with probabilities proportional to the squared volumes of the simplices defined by them and the origin (or the squared volumes…

Data Structures and Algorithms · Computer Science 2010-04-26 Amit Deshpande, Luis Rademacher

Gradient-based Sampling: An Adaptive Importance Sampling for Least-squares

In modern data analysis, random sampling is an efficient and widely-used strategy to overcome the computational difficulties brought by large sample size. In previous studies, researchers conducted random sampling which is according to the…

Machine Learning · Statistics 2018-03-05 Rong Zhu

Sample Efficient Linear Meta-Learning by Alternating Minimization

Meta-learning synthesizes and leverages the knowledge from a given set of tasks to rapidly learn new tasks using very little data. Meta-learning of linear regression tasks, where the regressors lie in a low-dimensional subspace, is an…

Machine Learning · Computer Science 2021-05-19 Kiran Koshy Thekumparampil, Prateek Jain, Praneeth Netrapalli, Sewoong Oh

Learning Adaptive Sampling and Reconstruction for Volume Visualization

A central challenge in data visualization is to understand which data samples are required to generate an image of a data set in which the relevant information is encoded. In this work, we make a first step towards answering the question of…

Graphics · Computer Science 2021-03-12 Sebastian Weiss, Mustafa Işık, Justus Thies, Rüdiger Westermann

Least Squares Wavelet-based Estimation for Additive Regression Models using Non Equally-Spaced Designs

Additive regression models are actively researched in the statistical field because of their usefulness in the analysis of responses determined by non-linear relationships with multivariate predictors. In this kind of statistical models,…

Methodology · Statistics 2018-04-10 German A. Schnaidt Grez, Brani Vidakovic

Weighted least squares methods for prediction in the functional data linear model

The problem of prediction in functional linear regression is conventionally addressed by reducing dimension via the standard principal component basis. In this paper we show that an alternative basis chosen through weighted least-squares,…

Methodology · Statistics 2009-02-20 Aurore Delaigle, Peter Hall, Tatiyana V. Apanasovich

Deep Partial Least Squares for Instrumental Variable Regression

In this paper, we propose deep partial least squares for the estimation of high-dimensional nonlinear instrumental variable regression. As a precursor to a flexible deep neural network architecture, our methodology uses partial least…

Methodology · Statistics 2023-06-06 Maria Nareklishvili, Nicholas Polson, Vadim Sokolov