Related papers: Local Learning with Neuron Groups

Parallel Training of Deep Networks with Local Updates

Deep learning models trained on large data sets have been widely successful in both vision and language domains. As state-of-the-art deep learning architectures have continued to grow in parameter count, so have the compute budgets and times…

Machine Learning · Computer Science 2021-06-16 Michael Laskin, Luke Metz, Seth Nabarro, Mark Saroufim, Badreddine Noune, Carlo Luschi, Jascha Sohl-Dickstein, Pieter Abbeel

Locally Supervised Learning with Periodic Global Guidance

Locally supervised learning aims to train a neural network based on a local estimation of the global loss function at each decoupled module of the network. Auxiliary networks are typically appended to the modules to approximate the gradient…

Machine Learning · Computer Science 2022-08-02 Hasnain Irshad Bhatti, Jaekyun Moon

Local Critic Training for Model-Parallel Learning of Deep Neural Networks

In this paper, we propose a novel model-parallel learning method, called local critic training, which trains neural networks using additional modules called local critic networks. The main network is divided into several layer groups and…

Machine Learning · Computer Science 2021-02-04 Hojung Lee, Cho-Jui Hsieh, Jong-Seok Lee

Training Neural Networks with Local Error Signals

Supervised training of neural networks for classification is typically performed with a global loss function. The loss function provides a gradient for the output layer, and this gradient is back-propagated to hidden layers to dictate an…

Machine Learning · Statistics 2019-05-09 Arild Nøkland, Lars Hiller Eidnes

Interlocking Backpropagation: Improving depthwise model-parallelism

The number of parameters in state-of-the-art neural networks has drastically increased in recent years. This surge of interest in large-scale neural networks has motivated the development of new distributed training strategies enabling such…

Machine Learning · Computer Science 2022-07-11 Aidan N. Gomez, Oscar Key, Kuba Perlin, Stephen Gou, Nick Frosst, Jeff Dean, Yarin Gal

A Theory of Local Learning, the Learning Channel, and the Optimality of Backpropagation

In a physical neural system, where storage and processing are intimately intertwined, the rules for adjusting the synaptic weights can only depend on variables that are available locally, such as the activity of the pre- and post-synaptic…

Machine Learning · Computer Science 2016-10-25 Pierre Baldi, Peter Sadowski

Collaborative Learning over Wireless Networks: An Introductory Overview

In this chapter, we will mainly focus on collaborative training across wireless devices. Training a ML model is equivalent to solving an optimization problem, and many distributed optimization algorithms have been developed over the last…

Machine Learning · Computer Science 2021-12-13 Emre Ozfatura, Deniz Gunduz, H. Vincent Poor

Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation

Relieving the reliance of neural network training on a global back-propagation (BP) has emerged as a notable research topic due to the biological implausibility and huge memory consumption caused by BP. Among the existing solutions, local…

Machine Learning · Computer Science 2024-06-11 Yibo Yang, Xiaojie Li, Motasem Alfarra, Hasan Hammoud, Adel Bibi, Philip Torr, Bernard Ghanem

Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks

Despite the recent success of Graph Neural Networks (GNNs), training GNNs on large graphs remains challenging. The limited resource capacities of the existing servers, the dependency between nodes in a graph, and the privacy concern due to…

Machine Learning · Computer Science 2022-03-15 Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Mahmut T. Kandemir, Anand Sivasubramaniam

Modular Duality in Deep Learning

An old idea in optimization theory says that since the gradient is a dual vector it may not be subtracted from the weights without first being mapped to the primal space where the weights reside. We take this idea seriously in this paper…

Machine Learning · Computer Science 2024-12-09 Jeremy Bernstein, Laker Newhouse

Personalized Federated Learning through Local Memorization

Federated learning allows clients to collaboratively learn statistical models while keeping their data local. Federated learning was originally used to train a unique global model to be served to all clients, but this approach might be…

Machine Learning · Computer Science 2022-06-20 Othmane Marfoq, Giovanni Neglia, Laetitia Kameni, Richard Vidal

Distributed Training of Deep Learning Models: A Taxonomic Perspective

Distributed deep learning systems (DDLS) train deep neural network models by utilizing the distributed resources of a cluster. Developers of DDLS are required to make many decisions to process their particular workloads in their chosen…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-09 Matthias Langer, Zhen He, Wenny Rahayu, Yanbo Xue

Learning Gradient Descent: Better Generalization and Longer Horizons

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Biased Local SGD for Efficient Deep Learning on Heterogeneous Systems

Most parallel neural network training methods assume homogeneous computing resources. For example, synchronous data-parallel SGD suffers from significant synchronization overhead under heterogeneous workloads, often forcing practitioners to…

Machine Learning · Computer Science 2026-02-24 Jihyun Lim, Junhyuk Jo, Chanhyeok Ko, Young Min Go, Jimin Hwa, Sunwoo Lee

Local Critic Training of Deep Neural Networks

This paper proposes a novel approach to train deep neural networks by unlocking the layer-wise dependency of backpropagation training. The approach employs additional modules called local critic networks besides the main network model to be…

Machine Learning · Computer Science 2018-09-28 Hojung Lee, Jong-seok Lee

LLS: Local Learning Rule for Deep Neural Networks Inspired by Neural Activity Synchronization

Training deep neural networks (DNNs) using traditional backpropagation (BP) presents challenges in terms of computational complexity and energy consumption, particularly for on-device learning where computational resources are limited.…

Neural and Evolutionary Computing · Computer Science 2025-07-08 Marco Paul E. Apolinario, Arani Roy, Kaushik Roy

Tackling the Local Bias in Federated Graph Learning

Federated graph learning (FGL) has become an important research topic in response to the increasing scale and the distributed nature of graph-structured data in the real world. In FGL, a global graph is distributed across different clients,…

Machine Learning · Computer Science 2024-08-27 Binchi Zhang, Minnan Luo, Shangbin Feng, Ziqi Liu, Jun Zhou, Qinghua Zheng

Collaborative Learning for Deep Neural Networks

We introduce collaborative learning in which multiple classifier heads of the same network are simultaneously trained on the same training data to improve generalization and robustness to label noise with no extra inference cost. It…

Machine Learning · Statistics 2018-11-08 Guocong Song, Wei Chai

Think Locally, Act Globally: Federated Learning with Local and Global Representations

Federated learning is a method of training models on private data distributed over multiple devices. To keep device data private, the global model is trained by only communicating parameters and updates which poses scalability challenges…

Machine Learning · Computer Science 2020-07-15 Paul Pu Liang, Terrance Liu, Liu Ziyin, Nicholas B. Allen, Randy P. Auerbach, David Brent, Ruslan Salakhutdinov, Louis-Philippe Morency

Faster Multi-GPU Training with PPLL: A Pipeline Parallelism Framework Leveraging Local Learning

Currently, training large-scale deep learning models is typically achieved through parallel training across multiple GPUs. However, due to the inherent communication overhead and synchronization delays in traditional model parallelism…

Computer Vision and Pattern Recognition · Computer Science 2024-11-21 Xiuyuan Guo, Chengqi Xu, Guinan Guo, Feiyu Zhu, Changpeng Cai, Peizhe Wang, Xiaoming Wei, Junhao Su, Jialin Gao