Related papers: Parallel Blockwise Knowledge Distillation for Deep…

Deep Neural Compression Via Concurrent Pruning and Self-Distillation

Pruning aims to reduce the number of parameters while maintaining performance close to the original network. This work proposes a novel self-distillation based pruning strategy, whereby the representational similarity between the…

Machine Learning · Computer Science 2021-10-01 James O'Neill, Sourav Dutta, Haytham Assem

Distilling particle knowledge for fast reconstruction at high-energy physics experiments

Knowledge distillation is a form of model compression that allows artificial neural networks of different sizes to learn from one another. Its main application is the compactification of large deep neural networks to free up computational…

High Energy Physics - Experiment · Physics 2024-05-08 Aritra Bal, Tristan Brandes, Fabio Iemmi, Markus Klute, Benedikt Maier, Vinicius Mikuni, Thea Aarrestad

Distilling Spikes: Knowledge Distillation in Spiking Neural Networks

Spiking Neural Networks (SNN) are energy-efficient computing architectures that exchange spikes for processing information, unlike classical Artificial Neural Networks (ANN). Due to this, SNNs are better suited for real-life deployments.…

Neural and Evolutionary Computing · Computer Science 2020-05-04 Ravi Kumar Kushawaha, Saurabh Kumar, Biplab Banerjee, Rajbabu Velmurugan

Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer

Deep neural network architectures have attained remarkable improvements in scene understanding tasks. Utilizing an efficient model is one of the most important constraints for limited-resource devices. Recently, several compression methods…

Computer Vision and Pattern Recognition · Computer Science 2020-10-12 Mahdi Ghorbani, Fahimeh Fooladgar, Shohreh Kasaei

Distilling the knowledge with quantum neural networks

Quantum Neural Networks (QNNs) are a promising class of quantum machine learning models with potential quantum advantages when implemented on scalable, error-corrected quantum computers. However, as system sizes increase, deploying QNNs…

Quantum Physics · Physics 2026-03-24 Yuxuan Yan, Sitian Qian, Qi Zhao, Xingjian Zhang

Energy-efficient Knowledge Distillation for Spiking Neural Networks

Spiking neural networks (SNNs) have been gaining interest as energy-efficient alternatives to conventional artificial neural networks (ANNs) due to their event-driven computation. Considering the future deployment of SNN models to…

Neural and Evolutionary Computing · Computer Science 2022-06-28 Dongjin Lee, Seongsik Park, Jongwan Kim, Wuhyeong Doh, Sungroh Yoon

Large scale distributed neural network training through online distillation

Techniques such as ensembling and distillation promise model quality improvements when paired with almost any base model. However, due to increased test-time cost (for ensembles) and increased complexity of the training pipeline (for…

Machine Learning · Computer Science 2020-08-24 Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton

Be Your Own Teacher: Improve the Performance of Convolutional Neural Networks via Self Distillation

Convolutional neural networks have been widely deployed in various application scenarios. In order to extend the applications' boundaries to some accuracy-crucial domains, researchers have been investigating approaches to boost accuracy…

Machine Learning · Computer Science 2019-05-21 Linfeng Zhang, Jiebo Song, Anni Gao, Jingwei Chen, Chenglong Bao, Kaisheng Ma

Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding

This paper explores the use of knowledge distillation to improve a Multi-Task Deep Neural Network (MT-DNN) (Liu et al., 2019) for learning text representations across multiple natural language understanding tasks. Although ensemble learning…

Computation and Language · Computer Science 2019-04-23 Xiaodong Liu, Pengcheng He, Weizhu Chen, Jianfeng Gao

Leveraging Knowledge Distillation for Efficient Deep Reinforcement Learning in Resource-Constrained Environments

This paper aims to explore the potential of combining Deep Reinforcement Learning (DRL) with Knowledge Distillation (KD) by distilling various DRL algorithms and studying their distillation effects. By doing so, the computational burden of…

Machine Learning · Computer Science 2024-04-03 Guanlin Meng

Competitive Distillation: A Simple Learning Strategy for Improving Visual Classification

Deep Neural Networks (DNNs) have significantly advanced the field of computer vision. To improve the DNN training process, knowledge distillation methods demonstrate their effectiveness in accelerating network training by introducing a fixed…

Computer Vision and Pattern Recognition · Computer Science 2025-07-01 Daqian Shi, Xiaolei Diao, Xu Chen, Cédric M. John

Knowledge Distillation Applied to Optical Channel Equalization: Solving the Parallelization Problem of Recurrent Connection

To circumvent the non-parallelizability of recurrent neural network-based equalizers, we propose knowledge distillation to recast the RNN into a parallelizable feedforward structure. The latter shows 38% latency decrease, while impacting…

Signal Processing · Electrical Eng. & Systems 2022-12-12 Sasipim Srivallapanondh, Pedro J. Freire, Bernhard Spinnler, Nelson Costa, Antonio Napoli, Sergei K. Turitsyn, Jaroslaw E. Prilepsky

Fast and Accurate Single Image Super-Resolution via Information Distillation Network

Recently, deep convolutional neural networks (CNNs) have demonstrated remarkable progress on single image super-resolution. However, as the depth and width of the networks increase, CNN-based super-resolution methods have been faced…

Computer Vision and Pattern Recognition · Computer Science 2018-03-28 Zheng Hui, Xiumei Wang, Xinbo Gao

Noisy Machines: Understanding Noisy Neural Networks and Enhancing Robustness to Analog Hardware Errors Using Distillation

The success of deep learning has brought forth a wave of interest in computer hardware design to better meet the high demands of neural network inference. In particular, analog computing hardware has been heavily motivated specifically for…

Machine Learning · Computer Science 2020-01-15 Chuteng Zhou, Prad Kadambi, Matthew Mattina, Paul N. Whatmough

Designing and Training of Lightweight Neural Networks on Edge Devices using Early Halting in Knowledge Distillation

Automated feature extraction capability and significant performance of Deep Neural Networks (DNN) make them suitable for Internet of Things (IoT) applications. However, deploying DNN on edge devices becomes prohibitive due to the colossal…

Machine Learning · Computer Science 2022-10-03 Rahul Mishra, Hari Prabhat Gupta

Pipe-BD: Pipelined Parallel Blockwise Distillation

Training large deep neural network models is highly challenging due to their tremendous computational and memory requirements. Blockwise distillation provides one promising method towards faster convergence by splitting a large model into…

Machine Learning · Computer Science 2023-01-31 Hongsun Jang, Jaewon Jung, Jaeyong Song, Joonsang Yu, Youngsok Kim, Jinho Lee

Efficient Learned Image Compression Through Knowledge Distillation

Learned image compression sits at the intersection of machine learning and image processing. With advances in deep learning, neural network-based compression methods have emerged. In this process, an encoder maps the image to a…

Computer Vision and Pattern Recognition · Computer Science 2025-09-15 Fabien Allemand, Attilio Fiandrotti, Sumanta Chaudhuri, Alaa Eddine Mazouz

Fast Tensorization of Neural Networks via Slice-wise Feature Distillation

We propose a scalable tensorization framework for neural network compression based on slice-wise feature distillation. Unlike conventional tensor decomposition methods that rely on costly global finetuning, our approach decomposes the…

Machine Learning · Computer Science 2026-05-20 Safa Hamreras, Sukhbinder Singh, Román Orús

Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks

The deep layers of modern neural networks extract a rather rich set of features as an input propagates through the network. This paper sets out to harvest these rich intermediate representations for quantization with minimal accuracy loss…

Machine Learning · Computer Science 2020-03-04 Ahmed T. Elthakeb, Prannoy Pilligundla, Alex Cloninger, Hadi Esmaeilzadeh

Joint Architecture and Knowledge Distillation in CNN for Chinese Text Recognition

The technique of distillation helps transform a cumbersome neural network into a compact network so that the model can be deployed on alternative hardware devices. The main advantages of distillation based approaches include simple training…

Computer Vision and Pattern Recognition · Computer Science 2020-10-27 Zi-Rui Wang, Jun Du