Related papers: Deep Combinatorial Aggregation

Averaging Weights Leads to Wider Optima and Better Generalization

Deep neural networks are typically trained by optimizing a loss function with an SGD variant, in conjunction with a decaying learning rate, until convergence. We show that simple averaging of multiple points along the trajectory of SGD,…

Machine Learning · Computer Science 2019-02-26 Pavel Izmailov, Dmitrii Podoprikhin, Timur Garipov, Dmitry Vetrov, Andrew Gordon Wilson

Trainable Weight Averaging: Accelerating Training and Improving Generalization

Weight averaging is a widely used technique for accelerating training and improving the generalization of deep neural networks (DNNs). While existing approaches like stochastic weight averaging (SWA) rely on pre-set weighting schemes, they…

Machine Learning · Computer Science 2025-02-11 Tao Li, Zhehao Huang, Yingwen Wu, Zhengbao He, Qinghua Tao, Xiaolin Huang, Chih-Jen Lin

Deep Component Analysis via Alternating Direction Neural Networks

Despite a lack of theoretical understanding, deep neural networks have achieved unparalleled performance in a wide range of applications. On the other hand, shallow representation learning with component analysis is associated with rich…

Machine Learning · Computer Science 2018-03-20 Calvin Murdock, Ming-Fang Chang, Simon Lucey

Hierarchical Weight Averaging for Deep Neural Networks

Despite the simplicity, stochastic gradient descent (SGD)-like algorithms are successful in training deep neural networks (DNNs). Among various attempts to improve SGD, weight averaging (WA), which averages the weights of multiple models,…

Machine Learning · Computer Science 2023-04-25 Xiaozhe Gu, Zixun Zhang, Yuncheng Jiang, Tao Luo, Ruimao Zhang, Shuguang Cui, Zhen Li

Diverse Weight Averaging for Out-of-Distribution Generalization

Standard neural networks struggle to generalize under distribution shifts in computer vision. Fortunately, combining multiple networks can consistently improve out-of-distribution generalization. In particular, weight averaging (WA)…

Computer Vision and Pattern Recognition · Computer Science 2023-01-30 Alexandre Ramé, Matthieu Kirchmeyer, Thibaud Rahier, Alain Rakotomamonjy, Patrick Gallinari, Matthieu Cord

Distributed Weight Consolidation: A Brain Segmentation Case Study

Collecting the large datasets needed to train deep neural networks can be very difficult, particularly for the many applications for which sharing and pooling data is complicated by practical, ethical, or legal concerns. However, it may be…

Machine Learning · Computer Science 2019-01-17 Patrick McClure, Charles Y. Zheng, Jakub R. Kaczmarzyk, John A. Lee, Satrajit S. Ghosh, Dylan Nielson, Peter Bandettini, Francisco Pereira

SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks

Designing a deep neural network (DNN) with good generalization capability is a complex process especially when the weights are severely quantized. Model averaging is a promising approach for achieving the good generalization capability of…

Machine Learning · Computer Science 2020-02-04 Sungho Shin, Yoonho Boo, Wonyong Sung

Deep Negative Correlation Classification

Ensemble learning serves as a straightforward way to improve the performance of almost any machine learning algorithm. Existing deep ensemble methods usually naively train many different models and then aggregate their predictions. This is…

Computer Vision and Pattern Recognition · Computer Science 2022-12-15 Le Zhang, Qibin Hou, Yun Liu, Jia-Wang Bian, Xun Xu, Joey Tianyi Zhou, Ce Zhu

Novel Uncertainty Framework for Deep Learning Ensembles

Deep neural networks have become the default choice for many of the machine learning tasks such as classification and regression. Dropout, a method commonly used to improve the convergence of deep neural networks, generates an ensemble of…

Machine Learning · Statistics 2019-04-11 Tal Kachman, Michal Moshkovitz, Michal Rosen-Zvi

Deep-Ensemble-Based Uncertainty Quantification in Spatiotemporal Graph Neural Networks for Traffic Forecasting

Deep-learning-based data-driven forecasting methods have produced impressive results for traffic forecasting. A major limitation of these methods, however, is that they provide forecasts without estimates of uncertainty, which are critical…

Machine Learning · Computer Science 2022-04-07 Tanwi Mallick, Prasanna Balaprakash, Jane Macfarlane

A Simple Baseline for Bayesian Uncertainty in Deep Learning

We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose approach for uncertainty representation and calibration in deep learning. Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient…

Machine Learning · Computer Science 2020-01-01 Wesley Maddox, Timur Garipov, Pavel Izmailov, Dmitry Vetrov, Andrew Gordon Wilson

Deep Gaussian Mixture Ensembles

This work introduces a novel probabilistic deep learning technique called deep Gaussian mixture ensembles (DGMEs), which enables accurate quantification of both epistemic and aleatoric uncertainty. By assuming the data generating process…

Machine Learning · Statistics 2023-06-13 Yousef El-Laham, Niccolò Dalmasso, Elizabeth Fons, Svitlana Vyetrenko

WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average

The performance of deep neural networks is enhanced by ensemble methods, which average the output of several models. However, this comes at an increased cost at inference. Weight averaging methods aim at balancing the generalization of…

Machine Learning · Computer Science 2024-05-29 Louis Fournier, Adel Nabli, Masih Aminbeidokhti, Marco Pedersoli, Eugene Belilovsky, Edouard Oyallon

FeTa: A DCA Pruning Algorithm with Generalization Error Guarantees

Recent DNN pruning algorithms have succeeded in reducing the number of parameters in fully connected layers, often with little or no drop in classification accuracy. However, most of the existing pruning schemes either have to be applied…

Machine Learning · Computer Science 2018-03-13 Konstantinos Pitas, Mike Davies, Pierre Vandergheynst

Do Deep Ensembles Actually Capture Uncertainty in Graph Neural Networks?

While deep ensembles are widely considered to be the default method for uncertainty quantification in deep learning, their effectiveness for graph-structured data is often simply assumed based on successes in domains like computer vision.…

Machine Learning · Computer Science 2026-05-22 Pedro C. Vieira, Pedro Ribeiro, Viacheslav Borovitskiy

Robust Modeling of Unknown Dynamical Systems via Ensemble Averaged Learning

Recent work has focused on data-driven learning of the evolution of unknown systems via deep neural networks (DNNs), with the goal of conducting long time prediction of the evolution of the unknown system. Training a DNN with low…

Machine Learning · Computer Science 2022-12-28 Victor Churchill, Steve Manns, Zhen Chen, Dongbin Xiu

Fast Uncertainty Estimates in Deep Learning Interatomic Potentials

Deep learning has emerged as a promising paradigm to give access to highly accurate predictions of molecular and materials properties. A common short-coming shared by current approaches, however, is that neural networks only give point…

Computational Physics · Physics 2023-05-10 Albert Zhu, Simon Batzner, Albert Musaelian, Boris Kozinsky

Predictive Uncertainty Quantification with Compound Density Networks

Despite the huge success of deep neural networks (NNs), finding good mechanisms for quantifying their prediction uncertainty is still an open problem. Bayesian neural networks are one of the most popular approaches to uncertainty…

Machine Learning · Statistics 2020-01-01 Agustinus Kristiadi, Sina Däubener, Asja Fischer

Ensemble deep learning: A review

Ensemble learning combines several individual models to obtain better generalization performance. Currently, deep learning architectures are showing better performance compared to the shallow or traditional models. Deep ensemble learning…

Machine Learning · Computer Science 2022-08-09 M. A. Ganaie, Minghui Hu, A. K. Malik, M. Tanveer, P. N. Suganthan

GRAWA: Gradient-based Weighted Averaging for Distributed Training of Deep Learning Models

We study distributed training of deep learning models in time-constrained environments. We propose a new algorithm that periodically pulls workers towards the center variable computed as a weighted average of workers, where the weights are…

Machine Learning · Computer Science 2024-03-08 Tolga Dimlioglu, Anna Choromanska