Related papers: A general framework for ensemble distribution dist…

Ensemble Distillation for Structured Prediction: Calibrated, Accurate, Fast-Choose Three

Modern neural networks do not always produce well-calibrated predictions, even when trained with a proper scoring function such as cross-entropy. In classification settings, simple methods such as isotonic regression or temperature scaling…

Machine Learning · Computer Science 2021-03-26 Steven Reich, David Mueller, Nicholas Andrews

Ensemble Distribution Distillation

Ensembles of models often yield improvements in system performance. These ensemble approaches have also been empirically shown to yield robust measures of uncertainty, and are capable of distinguishing between different forms of…

Machine Learning · Statistics 2019-11-27 Andrey Malinin, Bruno Mlodozeniec, Mark Gales

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

We formally study how ensemble of deep learning models can improve test accuracy, and how the superior performance of ensemble can be distilled into a single model using knowledge distillation. We consider the challenging case where the…

Machine Learning · Computer Science 2023-02-16 Zeyuan Allen-Zhu, Yuanzhi Li

Hydra: Preserving Ensemble Diversity for Model Distillation

Ensembles of models have been empirically shown to improve predictive performance and to yield robust measures of uncertainty. However, they are expensive in computation and memory. Therefore, recent research has focused on distilling…

Machine Learning · Computer Science 2021-03-22 Linh Tran, Bastiaan S. Veeling, Kevin Roth, Jakub Swiatkowski, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Sebastian Nowozin, Rodolphe Jenatton

Diversity Matters When Learning From Ensembles

Deep ensembles excel in large-scale image classification tasks both in terms of prediction accuracy and calibration. Despite being simple to train, the computation and memory cost of deep ensembles limits their practicability. While some…

Machine Learning · Computer Science 2021-10-28 Giung Nam, Jongmin Yoon, Yoonho Lee, Juho Lee

Anti-Distillation: Improving reproducibility of deep networks

Deep networks have been revolutionary in improving performance of machine learning and artificial intelligence systems. Their high prediction accuracy, however, comes at a price of model irreproducibility in very high levels that…

Machine Learning · Computer Science 2020-10-21 Gil I. Shamir, Lorenzo Coviello

Online Ensemble Model Compression using Knowledge Distillation

This paper presents a novel knowledge distillation based model compression framework consisting of a student ensemble. It enables distillation of simultaneously learnt ensemble knowledge onto each of the compressed student models. Each…

Computer Vision and Pattern Recognition · Computer Science 2020-11-17 Devesh Walawalkar, Zhiqiang Shen, Marios Savvides

Uncertainty-Aware Retinal Vessel Segmentation via Ensemble Distillation

Uncertainty estimation is critical for reliable medical image segmentation, particularly in retinal vessel analysis, where accurate predictions are essential for diagnostic applications. Deep Ensembles, where multiple networks are trained…

Computer Vision and Pattern Recognition · Computer Science 2025-09-16 Jeremiah Fadugba, Petru Manescu, Bolanle Oladejo, Delmiro Fernandez-Reyes, Philipp Berens

EnsembleNet: End-to-End Optimization of Multi-headed Models

Ensembling is a universally useful approach to boost the performance of machine learning models. However, individual models in an ensemble were traditionally trained independently in separate stages without information access about the…

Computer Vision and Pattern Recognition · Computer Science 2019-09-27 Hanhan Li, Joe Yue-Hei Ng, Paul Natsev

Self-Distribution Distillation: Efficient Uncertainty Estimation

Deep learning is increasingly being applied in safety-critical domains. For these scenarios it is important to know the level of uncertainty in a model's prediction to ensure appropriate decisions are made by the system. Deep ensembles are…

Machine Learning · Computer Science 2022-03-17 Yassir Fathullah, Mark J. F. Gales

Scaling Motion Forecasting Models with Ensemble Distillation

Motion forecasting has become an increasingly critical component of autonomous robotic systems. Onboard compute budgets typically limit the accuracy of real-time systems. In this work we propose methods of improving motion forecasting…

Robotics · Computer Science 2024-05-15 Scott Ettinger, Kratarth Goel, Avikalp Srivastava, Rami Al-Rfou

Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation

Ensembles of deep neural networks have demonstrated superior performance, but their heavy computational cost hinders applying them for resource-limited environments. It motivates distilling knowledge from the ensemble teacher into a smaller…

Machine Learning · Computer Science 2022-07-01 Giung Nam, Hyungi Lee, Byeongho Heo, Juho Lee

Ensemble Knowledge Distillation for Learning Improved and Efficient Networks

Ensemble models comprising deep Convolutional Neural Networks (CNN) have shown significant improvements in model generalization but at the cost of large computation and memory requirements. In this paper, we present a framework for…

Computer Vision and Pattern Recognition · Computer Science 2020-04-03 Umar Asif, Jianbin Tang, Stefan Harrer

Distilling Model Knowledge

Top-performing machine learning systems, such as deep neural networks, large ensembles and complex probabilistic graphical models, can be expensive to store, slow to evaluate and hard to integrate into larger systems. Ideally, we would like…

Machine Learning · Statistics 2015-10-09 George Papamakarios

Distilling the Knowledge in a Neural Network

A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of…

Machine Learning · Statistics 2015-03-10 Geoffrey Hinton, Oriol Vinyals, Jeff Dean

Uncertainty quantification is a critical aspect of reinforcement learning and deep learning, with numerous applications ranging from efficient exploration and stable offline reinforcement learning to outlier detection in medical…

Machine Learning · Computer Science 2025-03-27 Moritz A. Zanger, Pascal R. Van der Vaart, Wendelin Böhmer, Matthijs T. J. Spaan

Ensemble Distillation Approaches for Grammatical Error Correction

Ensemble approaches are commonly used techniques for improving a system by combining multiple model predictions. Additionally, these schemes allow the uncertainty, as well as the source of the uncertainty, to be derived for the prediction.…

Computation and Language · Computer Science 2020-12-16 Yassir Fathullah, Mark Gales, Andrey Malinin

Unified and Effective Ensemble Knowledge Distillation

Ensemble knowledge distillation can extract knowledge from multiple teacher models and encode it into a single student model. Many existing methods learn and distill the student model on labeled data only. However, the teacher models are…

Machine Learning · Computer Science 2022-04-04 Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang

DUDES: Deep Uncertainty Distillation using Ensembles for Semantic Segmentation

Deep neural networks lack interpretability and tend to be overconfident, which poses a serious problem in safety-critical applications like autonomous driving, medical imaging, or machine vision tasks with high demands on reliability.…

Computer Vision and Pattern Recognition · Computer Science 2024-03-27 Steven Landgraf, Kira Wursthorn, Markus Hillemann, Markus Ulrich

Functional Ensemble Distillation

Bayesian models have many desirable properties, most notable is their ability to generalize from limited data and to properly estimate the uncertainty in their predictions. However, these benefits come at a steep computational cost as…

Machine Learning · Computer Science 2022-06-07 Coby Penso, Idan Achituve, Ethan Fetaya