Related papers: Linear Regression using Heterogeneous Data Batches

Batch List-Decodable Linear Regression via Higher Moments

We study the task of list-decodable linear regression using batches. A batch is called clean if it consists of i.i.d. samples from an unknown linear regression distribution. For a parameter $\alpha \in (0, 1/2)$, an unknown…

Machine Learning · Computer Science 2025-03-14 Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Sihan Liu, Thanasis Pittas

BALI: Learning Neural Networks via Bayesian Layerwise Inference

We introduce a new method for learning Bayesian neural networks, treating them as a stack of multivariate Bayesian linear regression models. The main idea is to infer the layerwise posterior exactly if we know the target outputs of each…

Machine Learning · Computer Science 2024-11-20 Richard Kurle, Alexej Klushyn, Ralf Herbrich

Learning with Subset Stacking

We propose a new regression algorithm that learns from a set of input-output pairs. Our algorithm is designed for populations where the relation between the input variables and the output variable exhibits a heterogeneous behavior across…

Machine Learning · Computer Science 2026-02-17 Ş. İlker Birbil, Sinan Yıldırım, Samet Çopur, M. Hakan Akyüz

Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression

Learning algorithms that divide the data into batches are prevalent in many machine-learning applications, typically offering useful trade-offs between computational efficiency and performance. In this paper, we examine the benefits of…

Machine Learning · Computer Science 2024-09-24 Shahar Stein Ioushua, Inbar Hasidim, Ofer Shayevitz, Meir Feder

Learning Discrete Distributions from Untrusted Batches

We consider the problem of learning a discrete distribution in the presence of an $\epsilon$ fraction of malicious data sources. Specifically, we consider the setting where there is some underlying distribution, $p$, and each data source…

Machine Learning · Computer Science 2017-11-23 Mingda Qiao, Gregory Valiant

A block-random algorithm for learning on distributed, heterogeneous data

Most deep learning models are based on deep neural networks with multiple layers between input and output. The parameters defining these layers are initialized using random values and are "learned" from data, typically using stochastic…

Machine Learning · Computer Science 2019-03-05 Prakash Mohan, Marc T. Henry de Frahan, Ryan King, Ray W. Grout

Efficient List-Decodable Regression using Batches

We begin the study of list-decodable linear regression using batches. In this setting only an $\alpha \in (0,1]$ fraction of the batches are genuine. Each genuine batch contains $\ge n$ i.i.d. samples from a common unknown distribution and…

Machine Learning · Computer Science 2022-11-24 Abhimanyu Das, Ayush Jain, Weihao Kong, Rajat Sen

Linear-Sample Learning of Low-Rank Distributions

Many latent-variable applications, including community detection, collaborative filtering, genomic analysis, and NLP, model data as generated by low-rank matrices. Yet despite considerable research, except for very special cases, the number…

Machine Learning · Computer Science 2020-10-02 Ayush Jain, Alon Orlitsky

Learning Entangled Single-Sample Distributions via Iterative Trimming

In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of distributions, given one single sample from each distribution. We study mean estimation and linear…

Machine Learning · Computer Science 2020-07-08 Hui Yuan, Yingyu Liang

Robust Meta-learning for Mixed Linear Regression with Small Batches

A common challenge faced in practical supervised learning, such as medical image processing and robotic interactions, is that there are plenty of tasks but each task cannot afford to collect enough labeled examples to be learned in…

Machine Learning · Computer Science 2020-06-22 Weihao Kong, Raghav Somani, Sham Kakade, Sewoong Oh

Uncertainty in Model-Agnostic Meta-Learning using Variational Inference

We introduce a new, rigorously-formulated Bayesian meta-learning algorithm that learns a probability distribution of model parameter prior for few-shot learning. The proposed algorithm employs a gradient-based variational inference to infer…

Machine Learning · Computer Science 2022-03-21 Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro

Bayesian Approaches to Distribution Regression

Distribution regression has recently attracted much interest as a generic solution to the problem of supervised learning where labels are available at the group level, rather than at the individual level. Current approaches, however, do not…

Machine Learning · Statistics 2021-01-18 Ho Chung Leon Law, Danica J. Sutherland, Dino Sejdinovic, Seth Flaxman

Robust Methods for High-Dimensional Linear Learning

We propose statistically robust and computationally efficient linear learning methods in the high-dimensional batch setting, where the number of features $d$ may exceed the sample size $n$. We employ, in a generic learning setting, two…

Machine Learning · Statistics 2023-05-30 Ibrahim Merad, Stéphane Gaïffas

A General Method for Robust Learning from Batches

In many applications, data is collected in batches, some of which are corrupt or even adversarial. Recent work derived optimal robust algorithms for estimating discrete distributions in this setting. We consider a general framework of…

Machine Learning · Statistics 2020-02-26 Ayush Jain, Alon Orlitsky

Sample Complexity of Learning Mixtures of Sparse Linear Regressions

In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly…

Machine Learning · Computer Science 2019-11-01 Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

Investigating Batch Inference in a Sequential Monte Carlo Framework for Neural Networks

Bayesian inference allows us to define a posterior distribution over the weights of a generic neural network (NN). Exact posteriors are usually intractable, in which case approximations can be employed. One such approximation - variational…

Machine Learning · Computer Science 2026-01-30 Andrew Millard, Joshua Murphy, Peter Green, Simon Maskell

Inference in High-Dimensional Linear Regression via Lattice Basis Reduction and Integer Relation Detection

We focus on the high-dimensional linear regression problem, where the algorithmic goal is to efficiently infer an unknown feature vector $\beta^*\in\mathbb{R}^p$ from its linear measurements, using a small number $n$ of samples. Unlike most…

Statistics Theory · Mathematics 2023-09-19 David Gamarnik, Eren C. Kızıldağ, Ilias Zadik

Sample-Efficient Linear Representation Learning from Non-IID Non-Isotropic Data

A powerful concept behind much of the recent progress in machine learning is the extraction of common features across data from heterogeneous sources or tasks. Intuitively, using all of one's data to learn a common representation function…

Machine Learning · Statistics 2024-10-15 Thomas T. C. K. Zhang, Leonardo F. Toso, James Anderson, Nikolai Matni

Learning the Base Distribution in Implicit Generative Models

Popular generative model learning methods such as Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs) enforce the latent representation to follow simple distributions such as isotropic Gaussian. In this paper, we…

Machine Learning · Computer Science 2018-03-15 Cem Subakan, Oluwasanmi Koyejo, Paris Smaragdis

Investigating the Histogram Loss in Regression

It is becoming increasingly common in regression to train neural networks that model the entire distribution even if only the mean is required for prediction. This additional modeling often comes with performance gain and the reasons behind…

Machine Learning · Computer Science 2024-10-22 Ehsan Imani, Kai Luedemann, Sam Scholnick-Hughes, Esraa Elelimy, Martha White