Related papers: Sample-Efficient Linear Regression with Self-Selec…

200 papers

In the classical setting of self-selection, the goal is to learn $k$ models simultaneously from observations $(x^{(i)}, y^{(i)})$, where $y^{(i)}$ is the output of one of $k$ underlying models on input $x^{(i)}$. In contrast to mixture…

Statistics Theory · Mathematics 2022-12-13 Yeshwanth Cherapanamjeri, Constantinos Daskalakis, Andrew Ilyas, Manolis Zampetakis

We revisit the problem of estimating $k$ linear regressors with self-selection bias in $d$ dimensions with the maximum selection criterion, as introduced by Cherapanamjeri, Daskalakis, Ilyas, and Zampetakis [CDIZ23, STOC'23]. Our main…

Machine Learning · Statistics 2025-04-11 Alkis Kalavasis, Anay Mehrotra, Felix Zhou

We obtain robust and computationally efficient estimators for learning several linear models that achieve statistically optimal convergence rate under minimal distributional assumptions. Concretely, we assume our data is drawn from a…

Machine Learning · Statistics 2020-12-07 Ainesh Bakshi, Adarsh Prasad

We give the first polynomial-time algorithm for performing linear or polynomial regression resilient to adversarial corruptions in both examples and labels. Given a sufficiently large (polynomial-size) training set drawn i.i.d. from…

Machine Learning · Computer Science 2020-06-05 Adam Klivans, Pravesh K. Kothari, Raghu Meka

We propose a new estimator for the high-dimensional linear regression model with observation error in the design where the number of coefficients is potentially larger than the sample size. The main novelty of our procedure is that the…

Methodology · Statistics 2019-09-09 Alexandre Belloni, Abhishek Kaul, Mathieu Rosenbaum

Sparse linear regression is a central problem in high-dimensional statistics. We study the correlated random design setting, where the covariates are drawn from a multivariate Gaussian $N(0,\Sigma)$, and we seek an estimator with small…

Data Structures and Algorithms · Computer Science 2023-05-29 Jonathan Kelner, Frederic Koehler, Raghu Meka, Dhruv Rohatgi

We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples. For any $\alpha < 1$, our algorithm takes as input a sample…

Data Structures and Algorithms · Computer Science 2019-05-31 Sushrut Karmalkar, Adam R. Klivans, Pravesh K. Kothari

We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X,w^* \rangle + \epsilon$ (with $X \in \mathbb{R}^d$ and $\epsilon$ independent), in…

We study efficient algorithms for linear regression and covariance estimation in the absence of Gaussian assumptions on the underlying distributions of samples, making assumptions instead about only finitely-many moments. We focus on how…

In this paper, we study the problem of online sparse linear regression (OSLR) where the algorithms are restricted to accessing only $k$ out of $d$ attributes per instance for prediction, which was proved to be NP-hard. Previous work gave…

Machine Learning · Computer Science 2025-11-03 Junfan Li, Shizhong Liao, Zenglin Xu, Liqiang Nie

We consider the problem of finding an approximate solution to $\ell_1$ regression while only observing a small number of labels. Given an $n \times d$ unlabeled data matrix $X$, we must choose a small set of $m \ll n$ rows to observe the…

Machine Learning · Computer Science 2021-05-21 Aditya Parulekar, Advait Parulekar, Eric Price

We study the classical problem of predicting an outcome variable, $Y$, using a linear combination of a $d$-dimensional covariate vector, $\mathbf{X}$. We are interested in linear predictors whose coefficients solve…

Statistics Theory · Mathematics 2024-04-10 José Luis Montiel Olea, Cynthia Rush, Amilcar Velez, Johannes Wiesel

Sparse linear regression is one of the most basic questions in machine learning and statistics. Here, we are given as input a design matrix $X \in \mathbb{R}^{N \times d}$ and measurements or labels ${y} \in \mathbb{R}^N$ where ${y} = {X}…

Machine Learning · Computer Science 2025-11-11 Gautam Chandrasekaran, Raghu Meka, Konstantinos Stavropoulos

We study the problem of high-dimensional linear regression in a robust model where an $\epsilon$-fraction of the samples can be adversarially corrupted. We focus on the fundamental setting where the covariates of the uncorrupted samples are…

Machine Learning · Computer Science 2018-06-04 Ilias Diakonikolas, Weihao Kong, Alistair Stewart

We study the task of noiseless linear regression under Gaussian covariates in the presence of additive oblivious contamination. Specifically, we are given i.i.d. samples from a distribution $(x, y)$ on $\mathbb{R}^d \times \mathbb{R}$ with…

Data Structures and Algorithms · Computer Science 2025-10-14 Ilias Diakonikolas, Chao Gao, Daniel M. Kane, John Lafferty, Ankit Pensia

In this paper we discuss the variable selection method from $\ell_0$-norm constrained regression, which is equivalent to the problem of finding the best subset of a fixed size. Our study focuses on two aspects, consistency and computation. We…

Methodology · Statistics 2013-03-20 Shifeng Xiong

We study the estimation of distributional parameters when samples are shown only if they fall in some unknown set $S \subseteq \mathbb{R}^d$. Kontonis, Tzamos, and Zampetakis (FOCS'19) gave a $d^{\mathrm{poly}(1/\varepsilon)}$ time…

Statistics Theory · Mathematics 2026-05-12 Jane H. Lee, Anay Mehrotra, Manolis Zampetakis

A significant hurdle for analyzing large sample data is the lack of effective statistical computing and inference methods. An emerging powerful approach for analyzing large sample data is subsampling, by which one takes a random subsample…

Methodology · Statistics 2015-11-24 Rong Zhu, Ping Ma, Michael W. Mahoney, Bin Yu

Linear regression studies the problem of estimating a model parameter $\beta^* \in \mathbb{R}^p$ from $n$ observations $\{(y_i,\mathbf{x}_i)\}_{i=1}^n$ drawn from the linear model $y_i = \langle \mathbf{x}_i,\beta^* \rangle + \epsilon_i$. We…

Machine Learning · Statistics 2015-05-14 Xinyang Yi, Zhaoran Wang, Constantine Caramanis, Han Liu

This work is a re-examination of the sparse Bayesian learning (SBL) of linear regression models of Tipping (2001) in a high-dimensional setting. We propose a hard-thresholded version of the SBL estimator that achieves, for orthogonal design…

Methodology · Statistics 2015-02-12 Yves Atchade, Chia Chye Yee