Related papers: Generative Adversarial Simulator

Synthetic data generation method for data-free knowledge distillation in regression neural networks

Knowledge distillation is the technique of compressing a larger neural network, known as the teacher, into a smaller neural network, known as the student, while still trying to maintain the performance of the larger neural network as much…

Machine Learning · Computer Science 2023-05-11 Tianxun Zhou, Keng-Hwee Chiam

Data-Free Adversarial Distillation

Knowledge Distillation (KD) has made remarkable progress in the last few years and become a popular paradigm for model compression and knowledge transfer. However, almost all existing KD algorithms are data-driven, i.e., relying on a large…

Machine Learning · Computer Science 2020-03-03 Gongfan Fang, Jie Song, Chengchao Shen, Xinchao Wang, Da Chen, Mingli Song

Knowledge Distillation with Training Wheels

Knowledge distillation is used, in generative language modeling, to train a smaller student model using the help of a larger teacher model, resulting in improved capabilities for the student model. In this paper, we formulate a more general…

Computation and Language · Computer Science 2025-02-26 Guanlin Liu, Anand Ramachandran, Tanmay Gangwani, Yan Fu, Abhinav Sethy

Enhancing Data-Free Adversarial Distillation with Activation Regularization and Virtual Interpolation

Knowledge distillation refers to a technique of transferring the knowledge from a large learned model or an ensemble of learned models to a small model. This method relies on access to the original training set, which might not always be…

Machine Learning · Computer Science 2021-02-24 Xiaoyang Qu, Jianzong Wang, Jing Xiao

Knowledge Distillation with the Reused Teacher Classifier

Knowledge distillation aims to compress a powerful yet cumbersome teacher model into a lightweight student model without much sacrifice of performance. For this purpose, various approaches have been proposed over the past few years,…

Computer Vision and Pattern Recognition · Computer Science 2022-03-29 Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, Chun Chen

Dual Discriminator Adversarial Distillation for Data-free Model Compression

Knowledge distillation has been widely used to produce portable and efficient neural networks which can be well applied on edge devices for computer vision tasks. However, almost all top-performing knowledge distillation methods need to…

Computer Vision and Pattern Recognition · Computer Science 2021-10-06 Haoran Zhao, Xin Sun, Junyu Dong, Hui Yu, Huiyu Zhou

Hybrid Data-Free Knowledge Distillation

Data-free knowledge distillation aims to learn a compact student network from a pre-trained large teacher network without using the original training data of the teacher network. Existing collection-based and generation-based methods train…

Computer Vision and Pattern Recognition · Computer Science 2024-12-19 Jialiang Tang, Shuo Chen, Chen Gong

Teacher Agent: A Knowledge Distillation-Free Framework for Rehearsal-based Video Incremental Learning

Rehearsal-based video incremental learning often employs knowledge distillation to mitigate catastrophic forgetting of previously learned data. However, this method faces two major challenges for video tasks: substantial computing resources…

Computer Vision and Pattern Recognition · Computer Science 2023-12-12 Shengqin Jiang, Yaoyu Fang, Haokui Zhang, Qingshan Liu, Yuankai Qi, Yang Yang, Peng Wang

Reinforced Multi-Teacher Selection for Knowledge Distillation

In natural language processing (NLP) tasks, slow inference speed and huge footprints in GPU usage remain the bottleneck of applying pre-trained deep models in production. As a popular method for model compression, knowledge distillation…

Computation and Language · Computer Science 2020-12-15 Fei Yuan, Linjun Shou, Jian Pei, Wutao Lin, Ming Gong, Yan Fu, Daxin Jiang

Large-Scale Generative Data-Free Distillation

Knowledge distillation is one of the most popular and effective techniques for knowledge transfer, model compression and semi-supervised learning. Most existing distillation approaches require the access to original or augmented training…

Machine Learning · Computer Science 2020-12-11 Liangchen Luo, Mark Sandler, Zi Lin, Andrey Zhmoginov, Andrew Howard

Dual Policy Distillation

Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning. This teacher-student framework requires a well-trained teacher model which is…

Machine Learning · Computer Science 2020-06-09 Kwei-Herng Lai, Daochen Zha, Yuening Li, Xia Hu

Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection

Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model. In this process, we typically have multiple types of knowledge extracted from the teacher model. The problem is to make full use…

Computation and Language · Computer Science 2023-02-02 Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu

Zero-Shot Knowledge Distillation in Deep Networks

Knowledge distillation deals with the problem of training a smaller model (Student) from a high capacity source model (Teacher) so as to retain most of its performance. Existing approaches use either the training data or meta-data extracted…

Machine Learning · Computer Science 2019-05-21 Gaurav Kumar Nayak, Konda Reddy Mopuri, Vaisakh Shaj, R. Venkatesh Babu, Anirban Chakraborty

Data-Free Network Quantization With Adversarial Knowledge Distillation

Network quantization is an essential procedure in deep learning for development of efficient fixed-point inference models on mobile or edge platforms. However, as datasets grow larger and privacy regulations become stricter, data sharing…

Computer Vision and Pattern Recognition · Computer Science 2020-05-11 Yoojin Choi, Jihwan Choi, Mostafa El-Khamy, Jungwon Lee

Real-time Policy Distillation in Deep Reinforcement Learning

Policy distillation in deep reinforcement learning provides an effective way to transfer control policies from a larger network to a smaller untrained network without a significant degradation in performance. However, policy distillation is…

Machine Learning · Computer Science 2020-01-01 Yuxiang Sun, Pooyan Fazli

Data-Efficient Ranking Distillation for Image Retrieval

Recent advances in deep learning have led to rapid developments in the field of image retrieval. However, the best performing architectures incur significant computational cost. Recent approaches tackle this issue using knowledge…

Computer Vision and Pattern Recognition · Computer Science 2020-07-14 Zakaria Laskar, Juho Kannala

Knowledge Distillation Detection for Open-weights Models

We propose the task of knowledge distillation detection, which aims to determine whether a student model has been distilled from a given teacher, under a practical setting where only the student's weights and the teacher's API are…

Machine Learning · Computer Science 2025-10-03 Qin Shi, Amber Yijia Zheng, Qifan Song, Raymond A. Yeh

GOVERN: Gradient Orientation Vote Ensemble for Multi-Teacher Reinforced Distillation

Pre-trained language models have become an integral component of question-answering systems, achieving remarkable performance. However, for practical deployment, it is crucial to perform knowledge distillation to maintain high performance…

Computation and Language · Computer Science 2024-10-16 Wenjie Zhou, Zhenxin Ding, Xiaodong Zhang, Haibo Shi, Junfeng Wang, Dawei Yin

On the benefits of knowledge distillation for adversarial robustness

Knowledge distillation is normally used to compress a big network, or teacher, onto a smaller one, the student, by training it to match its outputs. Recently, some works have shown that robustness against adversarial attacks can also be…

Machine Learning · Computer Science 2022-03-15 Javier Maroto, Guillermo Ortiz-Jiménez, Pascal Frossard

Improving Adversarial Robustness Through Adaptive Learning-Driven Multi-Teacher Knowledge Distillation

Convolutional neural networks (CNNs) excel in computer vision but are susceptible to adversarial attacks, crafted perturbations designed to mislead predictions. Despite advances in adversarial training, a gap persists between model accuracy…

Computer Vision and Pattern Recognition · Computer Science 2025-07-29 Hayat Ullah, Syed Muhammad Talha Zaidi, Arslan Munir