Related papers: Distributed linear regression by averaging

Linear Regression with Distributed Learning: A Generalization Error Perspective

Distributed learning provides an attractive framework for scaling the learning task by sharing the computational load over multiple nodes in a network. Here, we investigate the performance of distributed learning for large-scale linear…

Machine Learning · Statistics 2021-11-03 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Information-Theoretic Perspective of Federated Learning

An approach to distributed machine learning is to train models on local datasets and aggregate these models into a single, stronger model. A popular instance of this form of parallelization is federated learning, where the nodes…

Machine Learning · Computer Science 2019-11-19 Linara Adilova, Julia Rosenzweig, Michael Kamp

On the Optimality of Averaging in Distributed Statistical Learning

A common approach to statistical learning with big data is to randomly split it among $m$ machines and learn the parameter of interest by averaging the $m$ individual estimates. In this paper, focusing on empirical risk minimization, or…

Machine Learning · Statistics 2016-06-14 Jonathan Rosenblatt, Boaz Nadler

Generalization Error for Linear Regression under Distributed Learning

Distributed learning facilitates the scaling-up of data processing by distributing the computational burden over several nodes. Despite the vast interest in distributed learning, generalization performance of such approaches is not well…

Machine Learning · Statistics 2020-05-05 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Selective Inference with Distributed Data

As datasets grow larger, they are often distributed across multiple machines that compute in parallel and communicate with a central machine through short messages. In this paper, we focus on sparse regression and propose a new procedure…

Methodology · Statistics 2023-03-14 Sifan Liu, Snigdha Panigrahi

Distribution Regression for Sequential Data

Distribution regression refers to the supervised learning problem where labels are only available for groups of inputs instead of individual inputs. In this paper, we develop a rigorous mathematical framework for distribution regression…

Machine Learning · Computer Science 2021-09-30 Maud Lemercier, Cristopher Salvi, Theodoros Damoulas, Edwin V. Bonilla, Terry Lyons

Distributed Kernel Regression: An Algorithm for Training Collaboratively

This paper addresses the problem of distributed learning under communication constraints, motivated by distributed signal processing in wireless sensor networks and data mining with distributed databases. After formalizing a general model…

Machine Learning · Computer Science 2016-11-15 Joel B. Predd, Sanjeev R. Kulkarni, H. Vincent Poor

Distributed Learning over Unreliable Networks

Most of today's distributed machine learning systems assume reliable networks: whenever two machines exchange information (e.g., gradients or models), the network should guarantee the delivery of the message. At the same time, recent…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-05-17 Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu

Distributed Sparse Linear Regression under Communication Constraints

In multiple domains, statistical tasks are performed in distributed settings, with data split among several end machines that are connected to a fusion center. In various applications, the end machines have limited bandwidth and power, and…

Machine Learning · Computer Science 2026-01-05 Rodney Fonseca, Boaz Nadler

Learning Theory of Distributed Regression with Bias Corrected Regularization Kernel Network

Distributed learning is an effective way to analyze big data. In distributed regression, a typical approach is to divide the big data into multiple blocks, apply a base regression algorithm on each of them, and then simply average the…

Machine Learning · Computer Science 2017-08-08 Zhengchu Guo, Lei Shi, Qiang Wu

Distributed Parameter Map-Reduce

This paper describes how to convert a machine learning problem into a series of map-reduce tasks. We study the logistic regression algorithm. In logistic regression, it is assumed that samples are independent and each sample is…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-10-06 Qi Li

High-Dimensional Distributed Sparse Classification with Scalable Communication-Efficient Global Updates

As the size of datasets used in statistical learning continues to grow, distributed training of models has attracted increasing attention. These methods partition the data and exploit parallelism to reduce memory and runtime, but suffer…

Machine Learning · Computer Science 2024-07-10 Fred Lu, Ryan R. Curtin, Edward Raff, Francis Ferraro, James Holt

Distributed Networked Learning with Correlated Data

We consider a distributed estimation method in a setting with heterogeneous streams of correlated data distributed across nodes in a network. In the considered approach, linear models are estimated locally (i.e., with only local data)…

Machine Learning · Computer Science 2021-02-11 Lingzhou Hong, Alfredo Garcia, Ceyhun Eksin

Learning Entangled Single-Sample Distributions via Iterative Trimming

In the setting of entangled single-sample distributions, the goal is to estimate some common parameter shared by a family of distributions, given one single sample from each distribution. We study mean estimation and linear…

Machine Learning · Computer Science 2020-07-08 Hui Yuan, Yingyu Liang

Distributed Continual Learning with CoCoA in High-dimensional Linear Regression

We consider estimation under scenarios where the signals of interest exhibit change of characteristics over time. In particular, we consider the continual learning problem where different tasks, e.g., data with different distributions,…

Machine Learning · Computer Science 2023-12-05 Martin Hellkvist, Ayça Özçelikkale, Anders Ahlén

Optimal Model Averaging: Towards Personalized Collaborative Learning

In federated learning, differences in the data or objectives between the participating nodes motivate approaches to train a personalized machine learning model for each node. One such approach is weighted averaging between a locally trained…

Machine Learning · Computer Science 2021-10-26 Felix Grimberg, Mary-Anne Hartley, Sai P. Karimireddy, Martin Jaggi

Distributed Machine Learning with Sparse Heterogeneous Data

Motivated by distributed machine learning settings such as Federated Learning, we consider the problem of fitting a statistical model across a distributed collection of heterogeneous data sets whose similarity structure is encoded by a…

Statistics Theory · Mathematics 2021-11-30 Dominic Richards, Sahand N. Negahban, Patrick Rebeschini

Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation

We study the problem of estimating at a central server the mean of a set of vectors distributed across several nodes (one vector per node). When the vectors are high-dimensional, the communication cost of sending entire vectors may be…

Machine Learning · Computer Science 2021-10-18 Divyansh Jhunjhunwala, Ankur Mallick, Advait Gadhikar, Swanand Kadhe, Gauri Joshi

Distributed Online Linear Regression

We study online linear regression problems in a distributed setting, where the data is spread over a network. In each round, each network node proposes a linear predictor, with the objective of fitting the network-wide data. It then…

Machine Learning · Computer Science 2019-02-14 Deming Yuan, Alexandre Proutiere, Guodong Shi

Transfer learning via Regularized Linear Discriminant Analysis

Linear discriminant analysis is a widely used method for classification. However, the high dimensionality of predictors combined with small sample sizes often results in large classification errors. To address this challenge, it is crucial…

Machine Learning · Statistics 2025-01-09 Hongzhe Zhang, Arnab Auddy, Hongzhe Lee