Related papers: Efficient List-Decodable Regression using Batches

200 papers

We study the task of list-decodable linear regression using batches. A batch is called clean if it consists of i.i.d. samples from an unknown linear regression distribution. For a parameter $\alpha \in (0, 1/2)$, an unknown…

Machine Learning · Computer Science 2025-03-14 Ilias Diakonikolas , Daniel M. Kane , Sushrut Karmalkar , Sihan Liu , Thanasis Pittas

We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples. For any $\alpha < 1$, our algorithm takes as input a sample…

Data Structures and Algorithms · Computer Science 2019-05-31 Sushrut Karmalkar , Adam R. Klivans , Pravesh K. Kothari

Traditionally, robust statistics has focused on designing estimators tolerant to a minority of contaminated data. Robust list-decodable learning focuses on the more challenging regime where only a minority $\frac 1 k$ fraction of the…

Data Structures and Algorithms · Computer Science 2020-11-20 Ilias Diakonikolas , Daniel M. Kane , Daniel Kongsgaard , Jerry Li , Kevin Tian

Robust mean estimation is one of the most important problems in statistics: given a set of samples in $\mathbb{R}^d$ where an $\alpha$ fraction are drawn from some distribution $D$ and the rest are adversarially corrupted, we aim to…

Machine Learning · Computer Science 2022-12-07 Shiwei Zeng , Jie Shen

We introduce an expander-sketching framework for list-decodable linear regression that achieves sample complexity $\tilde{O}((d+\log(1/\delta))/\alpha)$, list size $O(1/\alpha)$, and near input-sparsity running time…

Machine Learning · Computer Science 2025-12-01 Herbod Pourali , Sajjad Hashemian , Ebrahim Ardeshir-Larijani

We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset. Specifically, we are given a set $T$ of $n$ points in $\mathbb{R}^d$ and a parameter $0< \alpha <\frac 1 2$ such that an…

Data Structures and Algorithms · Computer Science 2021-11-15 Ilias Diakonikolas , Daniel M. Kane , Daniel Kongsgaard , Jerry Li , Kevin Tian

In list-decodable learning, we are given a set of data points such that an $\alpha$-fraction of these points come from a nice distribution $D$, for some small $\alpha \ll 1$, and the goal is to output a short list of candidate solutions,…

Machine Learning · Computer Science 2025-11-25 Ziyun Chen , Spencer Compton , Daniel Kane , Jerry Li

In the list-decodable learning setup, an overwhelming majority (say, a $(1-\beta)$-fraction) of the input data consists of outliers, and the goal of an algorithm is to output a small list $\mathcal{L}$ of hypotheses such that one of them agrees…

Data Structures and Algorithms · Computer Science 2019-05-14 Prasad Raghavendra , Morris Yau

In list-decodable subspace recovery, the input is a collection of $n$ points, $\alpha n$ (for some $\alpha \ll 1/2$) of which are drawn i.i.d. from a distribution $\mathcal{D}$ with an isotropic rank $r$ covariance $\Pi_*$ (the…

Data Structures and Algorithms · Computer Science 2021-01-08 Ainesh Bakshi , Pravesh K. Kothari

We study the problem of list-decodable sparse mean estimation. Specifically, for a parameter $\alpha \in (0, 1/2)$, we are given $m$ points in $\mathbb{R}^n$, $\lfloor \alpha m \rfloor$ of which are i.i.d. samples from a distribution $D$…

Data Structures and Algorithms · Computer Science 2024-07-08 Ilias Diakonikolas , Daniel M. Kane , Sushrut Karmalkar , Ankit Pensia , Thanasis Pittas

The vast majority of theoretical results in machine learning and statistics assume that the available training data is a reasonably reliable reflection of the phenomena to be learned or estimated. Similarly, the majority of machine learning…

Machine Learning · Computer Science 2017-06-13 Moses Charikar , Jacob Steinhardt , Gregory Valiant

Learning from data in the presence of outliers is a fundamental problem in statistics. Until recently, no computationally efficient algorithms were known to compute the mean of a high dimensional distribution under natural assumptions in…

Data Structures and Algorithms · Computer Science 2021-01-22 Yeshwanth Cherapanamjeri , Sidhanth Mohanty , Morris Yau

We give the first polynomial-time algorithm for list-decodable covariance estimation. For any $\alpha > 0$, our algorithm takes as input a sample $Y \subseteq \mathbb{R}^d$ of size $n\geq d^{\mathsf{poly}(1/\alpha)}$ obtained by…

Data Structures and Algorithms · Computer Science 2022-06-23 Misha Ivkov , Pravesh K. Kothari

Learning from data in the presence of outliers is a fundamental problem in statistics. In this work, we study robust statistics in the presence of overwhelming outliers for the fundamental problem of subspace recovery. Given a dataset where…

Data Structures and Algorithms · Computer Science 2020-02-11 Prasad Raghavendra , Morris Yau

We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples. Specifically, we are given a set $T$ of labeled examples $(x, y) \in \mathbb{R}^d \times \mathbb{R}$ and a parameter $0<…

Data Structures and Algorithms · Computer Science 2021-06-18 Ilias Diakonikolas , Daniel M. Kane , Ankit Pensia , Thanasis Pittas , Alistair Stewart

In many learning applications, data are collected from multiple sources, each providing a batch of samples that by itself is insufficient to learn its input-output relationship. A common approach assumes that the sources fall in one…

Machine Learning · Computer Science 2023-09-06 Ayush Jain , Rajat Sen , Weihao Kong , Abhimanyu Das , Alon Orlitsky

Many applications, including natural language processing, sensor networks, collaborative filtering, and federated learning, call for estimating discrete distributions from data collected in batches, some of which may be untrustworthy,…

Machine Learning · Computer Science 2020-02-26 Ayush Jain , Alon Orlitsky

We study the problem of list-decodable Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. We develop a set of techniques that yield new efficient algorithms with significantly improved…

Data Structures and Algorithms · Computer Science 2017-11-21 Ilias Diakonikolas , Daniel M. Kane , Alistair Stewart

We study the problem of list-decodable mean estimation for bounded covariance distributions. Specifically, we are given a set $T$ of points in $\mathbb{R}^d$ with the promise that an unknown $\alpha$-fraction of points in $T$, where…

Machine Learning · Computer Science 2020-06-23 Ilias Diakonikolas , Daniel M. Kane , Daniel Kongsgaard

We consider the problem of learning a discrete distribution in the presence of an $\epsilon$ fraction of malicious data sources. Specifically, we consider the setting where there is some underlying distribution, $p$, and each data source…

Machine Learning · Computer Science 2017-11-23 Mingda Qiao , Gregory Valiant