Related papers: Sampling Algorithms and Coresets for Lp Regression
We give improved algorithms for the $\ell_{p}$-regression problem, $\min_{x} \|x\|_{p}$ such that $A x=b,$ for all $p \in (1,2) \cup (2,\infty).$ Our algorithms obtain a high accuracy solution in $\tilde{O}_{p}(m^{\frac{|p-2|}{2p + |p-2|}})…
In this paper we obtain improved iteration complexities for solving $\ell_p$ regression. We provide methods which given any full-rank $\mathbf{A} \in \mathbb{R}^{n \times d}$ with $n \geq d$, $b \in \mathbb{R}^n$, and $p \geq 2$ solve…
A coreset of a dataset with $n$ examples and $d$ features is a weighted subset of examples that is sufficient for solving downstream data analytic tasks. Nearly optimal constructions of coresets for least squares and $\ell_p$ linear…
We study the $\ell_p$ regression problem, which requires finding $\mathbf{x}\in\mathbb R^{d}$ that minimizes $\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p$ for a matrix $\mathbf{A}\in\mathbb R^{n \times d}$ and response vector…
The $\ell_p$-norm regression problem is a classic problem in optimization with wide ranging applications in machine learning and theoretical computer science. The goal is to compute $x^{\star} =\arg\min_{Ax=b}\|x\|_p^p$, where $x^{\star}\in…
We study $L_p$ polynomial regression. Given query access to a function $f:[-1,1] \rightarrow \mathbb{R}$, the goal is to find a degree $d$ polynomial $\hat{q}$ such that, for a given parameter $\varepsilon > 0$, $$ \|\hat{q}-f\|_p\le…
In real world, our datasets often contain outliers. Moreover, the outliers can seriously affect the final machine learning result. Most existing algorithms for handling outliers take high time complexities (e.g. quadratic or cubic…
For a matrix $A\in \mathbb{R}^{n\times d}$ with $n\geq d$, we consider the dual problems of $\min \|Ax-b\|_p^p, \, b\in \mathbb{R}^n$ and $\min_{A^\top x=b} \|x\|_p^p,\, b\in \mathbb{R}^d$. We improve the runtimes for solving these problems…
We provide fast algorithms for overconstrained $\ell_p$ regression and related problems: for an $n\times d$ input matrix $A$ and vector $b\in\mathbb{R}^n$, in $O(nd\log n)$ time we reduce the problem $\min_{x\in\mathbb{R}^d} \|Ax-b\|_p$ to…
We study active sampling algorithms for linear regression, which aim to query only a few entries of a target vector $b\in\mathbb R^n$ and output a near minimizer to $\min_{x\in\mathbb R^d} \|Ax-b\|$, for a design matrix $A\in\mathbb R^{n…
The seminal work of Cohen and Peng introduced Lewis weight sampling to the theoretical computer science community, yielding fast row sampling algorithms for approximating $d$-dimensional subspaces of $\ell_p$ up to $(1+\epsilon)$ error.…
We study the low rank approximation problem of any given matrix $A$ over $\mathbb{R}^{n\times m}$ and $\mathbb{C}^{n\times m}$ in entry-wise $\ell_p$ loss, that is, finding a rank-$k$ matrix $X$ such that $\|A-X\|_p$ is minimized. Unlike…
Accurate coresets are a weighted subset of the original dataset, ensuring a model trained on the accurate coreset maintains the same level of accuracy as a model trained on the full dataset. Primarily, these coresets have been studied for a…
There has been significant interest and progress recently in algorithms that solve regression problems involving tall and thin matrices in input sparsity time. These algorithms find shorter equivalent of a n*d matrix where n >> d, which…
The sparse regression problem, also known as best subset selection problem, can be cast as follows: Given a set $S$ of $n$ points in $\mathbb{R}^d$, a point $y\in \mathbb{R}^d$, and an integer $2 \leq k \leq d$, find an affine combination…
The $\ell_p$ subspace approximation problem is an NP-hard low rank approximation problem that generalizes the median hyperplane ($p = 1$), principal component analysis ($p = 2$), and center hyperplane problems ($p = \infty$). A popular…
A growing number of problems in computational mathematics can be reduced to the solution of many linear systems that are related, often depending smoothly or slowly on a parameter $p$, that is, $A(p)x(p)=b(p)$. We introduce an efficient…
This paper introduces the problem of coresets for regression problems to panel data settings. We first define coresets for several variants of regression problems with panel data and then present efficient algorithms to construct coresets…
Linear regression in $\ell_p$-norm is a canonical optimization problem that arises in several applications, including sparse recovery, semi-supervised learning, and signal processing. Generic convex optimization algorithms for solving…
Subsampling algorithms are a natural approach to reduce data size before fitting models on massive datasets. In recent years, several works have proposed methods for subsampling rows from a data matrix while maintaining relevant information…