Related papers: Distilling Interpretable Models into Human-Readabl…

Deep Distilling: automated code generation using explainable deep learning

Human reasoning can distill principles from observed patterns and generalize them to explain and solve novel problems. The most powerful artificial intelligence systems lack explainability and symbolic reasoning ability, and have therefore…

Machine Learning · Computer Science 2021-11-17 Paul J. Blazek, Kesavan Venkatesh, Milo M. Lin

Interpretable Few-Shot Learning via Linear Distillation

It is important to develop mathematically tractable models that can interpret knowledge extracted from the data and provide reasonable predictions. In this paper, we present Linear Distillation Learning, a simple remedy to improve the…

Machine Learning · Computer Science 2019-10-14 Arip Asadulaev, Igor Kuznetsov, Andrey Filchenkov

Towards a theory of model distillation

Distillation is the task of replacing a complicated machine learning model with a simpler model that approximates the original [BCNM06,HVD15]. Despite many practical applications, basic questions about the extent to which models can be…

Machine Learning · Computer Science 2024-05-07 Enric Boix-Adsera

A Generic Approach for Reproducible Model Distillation

Model distillation has been a popular method for producing interpretable machine learning. It uses an interpretable "student" model to mimic the predictions made by the black box "teacher" model. However, when the student model is sensitive…

Machine Learning · Statistics 2023-05-01 Yunzhe Zhou, Peiru Xu, Giles Hooker

Distilling Model Knowledge

Top-performing machine learning systems, such as deep neural networks, large ensembles and complex probabilistic graphical models, can be expensive to store, slow to evaluate and hard to integrate into larger systems. Ideally, we would like…

Machine Learning · Statistics 2015-10-09 George Papamakarios

Dataset Distillation

Model distillation aims to distill the knowledge of a complex model into a simpler one. In this paper, we consider an alternative formulation called dataset distillation: we keep the model fixed and instead attempt to distill the knowledge…

Machine Learning · Computer Science 2020-02-26 Tongzhou Wang, Jun-Yan Zhu, Antonio Torralba, Alexei A. Efros

Learning Interpretation with Explainable Knowledge Distillation

Knowledge Distillation (KD) has been considered as a key solution in model compression and acceleration in recent years. In KD, a small student model is generally trained from a large teacher model by minimizing the divergence between the…

Machine Learning · Computer Science 2021-11-16 Raed Alharbi, Minh N. Vu, My T. Thai

Efficient Sub-structured Knowledge Distillation

Structured prediction models aim at solving a type of problem where the output is a complex structure, rather than a single variable. Performing knowledge distillation for such models is not trivial due to their exponentially large output…

Machine Learning · Computer Science 2022-03-10 Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Hai-tao Zheng

Structural Knowledge Distillation: Tractably Distilling Information for Structured Predictor

Knowledge distillation is a critical technique to transfer knowledge between models, typically from a large model (the teacher) to a more fine-grained one (the student). The objective function of knowledge distillation is typically the…

Computation and Language · Computer Science 2021-06-03 Xinyu Wang, Yong Jiang, Zhaohui Yan, Zixia Jia, Nguyen Bach, Tao Wang, Zhongqiang Huang, Fei Huang, Kewei Tu

Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection

Knowledge distillation addresses the problem of transferring knowledge from a teacher model to a student model. In this process, we typically have multiple types of knowledge extracted from the teacher model. The problem is to make full use…

Computation and Language · Computer Science 2023-02-02 Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu

Knowledge Distillation with the Reused Teacher Classifier

Knowledge distillation aims to compress a powerful yet cumbersome teacher model into a lightweight student model without much sacrifice of performance. For this purpose, various approaches have been proposed over the past few years,…

Computer Vision and Pattern Recognition · Computer Science 2022-03-29 Defang Chen, Jian-Ping Mei, Hailin Zhang, Can Wang, Yan Feng, Chun Chen

Knowledge Distillation Detection for Open-weights Models

We propose the task of knowledge distillation detection, which aims to determine whether a student model has been distilled from a given teacher, under a practical setting where only the student's weights and the teacher's API are…

Machine Learning · Computer Science 2025-10-03 Qin Shi, Amber Yijia Zheng, Qifan Song, Raymond A. Yeh

Data Distillation for Text Classification

Deep learning techniques have achieved great success in many fields, while at the same time deep learning models are getting more complex and expensive to compute. It severely hinders the wide applications of these models. In order to…

Computation and Language · Computer Science 2021-04-20 Yongqi Li, Wenjie Li

Data-to-Model Distillation: Data-Efficient Learning Framework

Dataset distillation aims to distill the knowledge of a large-scale real dataset into small yet informative synthetic data such that a model trained on it performs as well as a model trained on the full dataset. Despite recent progress,…

Computer Vision and Pattern Recognition · Computer Science 2024-11-21 Ahmad Sajedi, Samir Khaki, Lucy Z. Liu, Ehsan Amjadian, Yuri A. Lawryshyn, Konstantinos N. Plataniotis

Distilling Black-Box Travel Mode Choice Model for Behavioral Interpretation

Machine learning has proved to be very successful for making predictions in travel behavior modeling. However, most machine-learning models have complex model structures and offer little or no explanation as to how they arrive at these…

Machine Learning · Statistics 2019-10-31 Xilei Zhao, Zhengze Zhou, Xiang Yan, Pascal Van Hentenryck

Approximation Trees: Statistical Stability in Model Distillation

This paper examines the stability of learned explanations for black-box predictions via model distillation with decision trees. One approach to intelligibility in machine learning is to use an understandable 'student' model to mimic the…

Machine Learning · Statistics 2018-08-24 Yichen Zhou, Zhengze Zhou, Giles Hooker

Joint learning of interpretation and distillation

The extra trust brought by the model interpretation has made it an indispensable part of machine learning systems. But to explain a distilled model's prediction, one may either work with the student model itself, or turn to its teacher…

Machine Learning · Computer Science 2020-05-26 Jinchao Huang, Guofu Li, Zhicong Yan, Fucai Luo, Shenghong Li

Multilingual Neural Machine Translation with Knowledge Distillation

Multilingual machine translation, which translates multiple languages with a single model, has attracted much attention due to its efficiency of offline training and online serving. However, traditional multilingual translation usually…

Computation and Language · Computer Science 2019-05-01 Xu Tan, Yi Ren, Di He, Tao Qin, Zhou Zhao, Tie-Yan Liu

Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models

Knowledge distillation methods have recently been shown to be a promising direction for speeding up the synthesis of large-scale diffusion models by requiring only a few inference steps. While several powerful distillation methods were recently…

Computer Vision and Pattern Recognition · Computer Science 2024-04-08 Nikita Starodubcev, Artem Fedorov, Artem Babenko, Dmitry Baranchuk

MED-TEX: Transferring and Explaining Knowledge with Less Data from Pretrained Medical Imaging Models

Deep learning methods usually require a large amount of training data and lack interpretability. In this paper, we propose a novel knowledge distillation and model interpretation framework for medical image classification that jointly…

Computer Vision and Pattern Recognition · Computer Science 2022-01-13 Thanh Nguyen-Duc, He Zhao, Jianfei Cai, Dinh Phung