English
Related papers

Related papers: BLoad: Enhancing Neural Network Training with Effi…

200 papers

This paper presents a new method for pre-training neural networks that can decrease the total training time for a neural network while maintaining the final performance, which motivates its use on deep neural networks. By partitioning the…

Neural and Evolutionary Computing · Computer Science 2016-01-05 Conrado S. Miranda , Fernando J. Von Zuben

This paper presents a comparative analysis of distributed training strategies for large-scale neural networks, focusing on data parallelism, model parallelism, and hybrid approaches. We evaluate these strategies on image classification…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-01 Vishnu Vardhan Baligodugula , Fathi Amsaad

Large-scale supervised classification algorithms, especially those based on deep convolutional neural networks (DCNNs), require vast amounts of training data to achieve state-of-the-art performance. Decreasing this data requirement would…

Computer Vision and Pattern Recognition · Computer Science 2016-06-15 Maya Kabkab , Azadeh Alavi , Rama Chellappa

Graph Neural Networks (GNNs) have become powerful tools for learning from graph-structured data, finding applications across diverse domains. However, as graph sizes and connectivity increase, standard GNN training methods face significant…

Machine Learning · Computer Science 2025-12-01 Eshed Gal , Moshe Eliasof , Carola-Bibiane Schönlieb , Ivan I. Kyrchei , Eldad Haber , Eran Treister

The width of a neural network matters since increasing the width will necessarily increase the model capacity. However, the performance of a network does not improve linearly with the width and soon gets saturated. In this case, we argue…

Computer Vision and Pattern Recognition · Computer Science 2022-09-07 Shuai Zhao , Liguang Zhou , Wenxiao Wang , Deng Cai , Tin Lun Lam , Yangsheng Xu

Data loading can dominate deep neural network training time on large-scale systems. We present a comprehensive study on accelerating data loading performance in large-scale distributed training. We first identify performance and scalability…

Machine Learning · Computer Science 2020-02-20 Chih-Chieh Yang , Guojing Cong

A sequential training method for large-scale feedforward neural networks is presented. Each layer of the neural network is decoupled and trained separately. After the training is completed for each layer, they are combined together. The…

Machine Learning · Computer Science 2019-05-21 Jongrae Kim

Recent hardware developments have dramatically increased the scale of data parallelism available for neural network training. Among the simplest ways to harness next-generation hardware is to increase the batch size in standard mini-batch…

Machine Learning · Computer Science 2019-07-22 Christopher J. Shallue , Jaehoon Lee , Joseph Antognini , Jascha Sohl-Dickstein , Roy Frostig , George E. Dahl

In this paper we present a technique to train neural network models on small amounts of data. Current methods for training neural networks on small amounts of rich data typically rely on strategies such as fine-tuning a pre-trained neural…

Machine Learning · Computer Science 2016-11-08 Ark Anderson , Kyle Shaffer , Artem Yankov , Court D. Corley , Nathan O. Hodas

Continual learning of deep neural networks is a key requirement for scaling them up to more complex applicative scenarios and for achieving real lifelong learning of these architectures. Previous approaches to the problem have considered…

Machine Learning · Computer Science 2020-06-25 Jary Pomponi , Simone Scardapane , Vincenzo Lomonaco , Aurelio Uncini

Deep learning has excelled in image recognition tasks through neural networks inspired by the human brain. However, the necessity for large models to improve prediction accuracy introduces significant computational demands and extended…

Computer Vision and Pattern Recognition · Computer Science 2024-08-27 Taigo Sakai , Kazuhiro Hotta

Distributed machine learning is critical for training deep learning models on large datasets with numerous parameters. Current research primarily focuses on leveraging additional hardware resources and powerful computing units to accelerate…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-03 Kuan-Wei Lu , Ding-Yong Hong , Pangfeng Liu , Jan-Jan Wu

Deep learning models are yielding increasingly better performances thanks to multiple factors. To be successful, model may have large number of parameters or complex architectures and be trained on large dataset. This leads to large…

Machine Learning · Computer Science 2022-12-20 Jean-Roch Vlimant , Junqi Yin

Effective training of today's large language models (LLMs) depends on large batches and long sequences for throughput and accuracy. To handle variable-length sequences on hardware accelerators, it is common practice to introduce padding…

Computation and Language · Computer Science 2022-10-07 Mario Michael Krell , Matej Kosec , Sergio P. Perez , Andrew Fitzgibbon

Modern deep neural network training is typically based on mini-batch stochastic gradient optimization. While the use of large mini-batches increases the available computational parallelism, small batch training has been shown to provide…

Machine Learning · Computer Science 2018-04-23 Dominic Masters , Carlo Luschi

Recurrent Neural Networks (RNN) are widely used to solve a variety of problems and as the quantity of data and the amount of available compute have increased, so have model sizes. The number of parameters in recent state-of-the-art networks…

Machine Learning · Computer Science 2017-11-08 Sharan Narang , Erich Elsen , Gregory Diamos , Shubho Sengupta

The success of modern deep learning is attributed to two key elements: huge amounts of training data and large model sizes. Where a vast amount of data allows the model to learn more features, the large model architecture boosts the…

Machine Learning · Computer Science 2024-10-08 Muhammad Asif Khan , Ridha Hamila , Hamid Menouar

Training machine learning models in parallel is an increasingly important workload. We accelerate distributed parallel training by designing a communication primitive that uses a programmable switch dataplane to execute a key step of the…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-01 Amedeo Sapio , Marco Canini , Chen-Yu Ho , Jacob Nelson , Panos Kalnis , Changhoon Kim , Arvind Krishnamurthy , Masoud Moshref , Dan R. K. Ports , Peter Richtárik

Scale of data and scale of computation infrastructures together enable the current deep learning renaissance. However, training large-scale deep architectures demands both algorithmic improvement and careful system configuration. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-21 Shang-Xuan Zou , Chun-Yen Chen , Jui-Lin Wu , Chun-Nan Chou , Chia-Chin Tsao , Kuan-Chieh Tung , Ting-Wei Lin , Cheng-Lung Sung , Edward Y. Chang

Deep learning applications are drastically progressing in seismic processing and interpretation tasks. However, the majority of approaches subsample data volumes and restrict model sizes to minimise computational requirements. Subsampling…

Geophysics · Physics 2021-02-26 Claire Birnie , Haithem Jarraya , Fredrik Hansteen
‹ Prev 1 2 3 10 Next ›